Juan,
let me try to answer some of these questions:
- You can probably start from something that already exists, such as
http://diotavelli.net/PyQtWiki/Python%20syntax%20highlighting or Pygments
http://pygments.org/,
https://bitbucket.org/birkenfeld/pygments-main/src/f6d0af39cb77/pygments/lexers/agile.py?at=default . IDLE, Spyder, Emacs and Eclipse all have built-in syntax highlighting for Python that can give you an idea of what your implementation might look like, see screenshot. Personally, I like the IDLE style a lot.
- The official Python documentation is at
http://www.python.org/doc/. Chapter 2 of the language reference
http://docs.python.org/2/reference/ explains the lexical structure
http://docs.python.org/2/reference/lexical_analysis.html, including comments, keywords, operators, string literals and other rules, in a very concise way.
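To get a feel for the token categories described there, here is a minimal sketch that uses Python's own standard-library tokenizer. It assumes Python 3; the function name classify and the category names are my own choices for illustration, not anything prescribed by the documentation.

```python
import io
import keyword
import tokenize

def classify(source):
    """Return (text, category) pairs for the tokens a highlighter would color."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keywords and ordinary identifiers share the NAME token type.
            category = "keyword" if keyword.iskeyword(tok.string) else "identifier"
        elif tok.type == tokenize.STRING:
            category = "string"
        elif tok.type == tokenize.NUMBER:
            category = "number"
        elif tok.type == tokenize.COMMENT:
            category = "comment"
        elif tok.type == tokenize.OP:
            category = "operator"
        else:
            continue  # NEWLINE, NL, INDENT, DEDENT, ENDMARKER carry no color
        result.append((tok.string, category))
    return result

tokens = classify("if x == 1:  # check\n    s = 'a'\n")
```

Note that the tokenizer may raise an error on incomplete or invalid input, so an editor that has to color half-typed code will probably still want its own, more forgiving lexer. The sketch is mainly useful as a reference for what the categories are.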
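- Python gives identifiers of the form _*, __* and __*__ special meaning (for example, _* names are not imported by "from module import *", and __* names inside a class are mangled to make them class-private), and some __*__ names are used for predefined special methods, see
http://docs.python.org/2/reference/datamodel.html#specialnames. Usually, they are not specially highlighted.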
- Function definition: see
http://docs.python.org/2/reference/compound_stmts.html#function-definitions. In a nutshell:
    def funcName(param1, param2, ...):
        code 1
        code 2
- Line comments: they start with # and run to the end of the line.
- No block comments. There is something special called docstrings: string literals that appear as the first statement in a module, function or class body. Syntactically they are strings, but they are relevant only for documentation (e.g. the help(object) command). They are usually highlighted as strings (which they are), not as comments. Example:
    class myClass(baseClass):
        """This is a multiline comment explaining my class
        details line1
        details line2
        """
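If you want to convince yourself that a docstring really is an ordinary string literal, here is a small sketch using the standard-library ast module. It assumes Python 3.8 or later; the sample class and the variable names are made up for illustration.

```python
import ast

source = '''
class MyClass(object):
    """This is a docstring.
    details line1
    """
    # an ordinary comment, discarded by the parser
    x = 1
'''

tree = ast.parse(source)
class_node = tree.body[0]          # the ClassDef node
first_stmt = class_node.body[0]    # first statement in the class body

# The docstring is an expression statement whose value is a str constant.
is_string_literal = (
    isinstance(first_stmt, ast.Expr)
    and isinstance(first_stmt.value, ast.Constant)
    and isinstance(first_stmt.value.value, str)
)

# ast.get_docstring returns the text, with indentation cleaned up.
docstring = ast.get_docstring(class_node)
```

The comment does not survive parsing, while the docstring is kept as part of the class, which is why highlighters tend to treat the two differently.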
- Indentation: This is special in Python. Instead of using {} or begin/end, code blocks are created by indentation after the ":" of certain statements such as def, class, if, while, for, ... The rules are simple: When a line ends with ":", indent everything that belongs to the code block by a consistent amount of whitespace. Go back to the previous indentation when the code block ends. This is very similar to what most people do in C, Tcl, ... anyway, but without the explicit {} or begin/end. And it is mandatory to indicate a block this way; apart from putting a single simple statement on the same line after the ":", there is just no other means. I know that many people don't like it, but others say it improves readability.
  - From the syntax highlighting point of view, you can ignore indentation.
  - You can support editing by automatically increasing the indent level after a ":" at the end of the line.
  - Should you want to implement source folding: follow the block definition given above.
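The auto-indent and folding rules above can be sketched roughly as follows. This is a simplified illustration, not a complete implementation: the helper names (indent_of, opens_block, next_indent, fold_ranges) and the default width of 4 are my own assumptions, a tab is counted as one column, and a "#" inside a string literal would confuse the comment stripping.

```python
def indent_of(line):
    """Number of leading whitespace characters (a tab counts as one)."""
    return len(line) - len(line.lstrip())

def opens_block(line):
    """True if the line, ignoring a trailing comment, ends with ':'."""
    code = line.split("#", 1)[0].rstrip()  # naive: ignores '#' inside strings
    return code.endswith(":")

def next_indent(line, width=4):
    """Indent to suggest for the line that follows `line`."""
    current = indent_of(line)
    return current + width if opens_block(line) else current

def fold_ranges(lines):
    """Return (start, end) line-index pairs, one per foldable block."""
    ranges = []
    for i, line in enumerate(lines):
        if not line.strip() or not opens_block(line):
            continue
        base = indent_of(line)
        end = i
        for j in range(i + 1, len(lines)):
            if not lines[j].strip():
                continue      # blank lines do not end a block
            if indent_of(lines[j]) <= base:
                break         # back at (or left of) the header's indent
            end = j
        if end > i:
            ranges.append((i, end))
    return ranges

example = [
    "def f(x):",
    "    if x:",
    "        return 1",
    "",
    "    return 0",
    "y = 2",
]
folds = fold_ranges(example)
```

For the example lines, the outer range covers the whole function and the inner one covers the if block. A real editor would also have to deal with continuation lines inside brackets and with mixed tabs and spaces, which this sketch ignores.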
- No preprocessor
- Other: not much.
  - You may want to highlight class/function decorators
http://docs.python.org/2/glossary.html#term-decorator.
  - Some, but not all, tools also highlight the most basic identifiers from Python's standard library. They are described in the Standard Library documentation
http://docs.python.org/2/library/index.html, chapters 1-6.
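For these last two points, a rough sketch of how a highlighter could spot decorator lines and built-in names. It assumes Python 3, where the built-ins live in the builtins module (in Python 2 the module is called __builtin__). The regular expression and the helper names are mine and only cover simple dotted decorator names.

```python
import builtins
import re

# Public names from the builtins module: len, print, ValueError, ...
BUILTIN_NAMES = frozenset(n for n in dir(builtins) if not n.startswith("_"))

# A line whose first non-blank character is '@', followed by a dotted name.
DECORATOR_RE = re.compile(r"^\s*(@\s*[A-Za-z_][\w.]*)")

def decorator_span(line):
    """Return (start, end) of the decorator on this line, or None."""
    match = DECORATOR_RE.match(line)
    return match.span(1) if match else None

def is_builtin(name):
    """True if `name` is one of the interpreter's built-in identifiers."""
    return name in BUILTIN_NAMES

span = decorator_span("    @staticmethod")
```

Taking the list from the running interpreter has the advantage that it stays in sync with the Python version, but an editor written in another language would need a fixed list taken from the documentation instead.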
Let me know if I can help.
Georg