Diffstat (limited to 'doc/Scintilla/Lexer.txt')
-rw-r--r-- | doc/Scintilla/Lexer.txt | 226 |
1 files changed, 0 insertions, 226 deletions
diff --git a/doc/Scintilla/Lexer.txt b/doc/Scintilla/Lexer.txt
deleted file mode 100644
index 9d4ab50..0000000
--- a/doc/Scintilla/Lexer.txt
+++ /dev/null
@@ -1,226 +0,0 @@
-How to write a scintilla lexer
-
-A lexer for a particular language determines how a specified range of
-text shall be colored. Writing a lexer is relatively straightforward
-because the lexer need only color given text. The harder job of
-determining how much text actually needs to be colored is handled by
-Scintilla itself, that is, the lexer's caller.
-
-
-Parameters
-
-The lexer for language LLL has the following prototype:
-
-    static void ColouriseLLLDoc (
-        unsigned int startPos, int length,
-        int initStyle,
-        WordList *keywordlists[],
-        Accessor &styler);
-
-The styler parameter is an Accessor object. The lexer must use this
-object to access the text to be colored. The lexer gets the character
-at position i using styler.SafeGetCharAt(i);
-
-The startPos and length parameters indicate the range of text to be
-recolored; the lexer must determine the proper color for all characters
-in positions startPos through startPos+length-1.
-
-The initStyle parameter indicates the initial state, that is, the state
-at the character before startPos. States also indicate the coloring to
-be used for a particular range of text.
-
-Note: the character at startPos is assumed to start a line, so if a
-newline terminates the initStyle state the lexer should enter its
-default state (or whatever state should follow initStyle).
-
-The keywordlists parameter specifies the keywords that the lexer must
-recognize. A WordList class object contains methods that simplify
-the recognition of keywords. Present lexers use a helper function
-called classifyWordLLL to recognize keywords. These functions show how
-to use the keywordlists parameter to recognize keywords. This
-documentation will not discuss keywords further.
-
-
-The lexer code
-
-The task of a lexer can be summarized briefly: for each range r of
-characters that are to be colored the same, the lexer should call
-
-    styler.ColourTo(i, state)
-
-where i is the position of the last character of the range r. The lexer
-should set the state variable to the coloring state of the character at
-position i and continue until the entire text has been colored.
-
-Note 1: the styler (Accessor) object remembers the i parameter in the
-previous calls to styler.ColourTo, so the single i parameter suffices to
-indicate a range of characters.
-
-Note 2: As a side effect of calling styler.ColourTo(i,state), the
-coloring states of all characters in the range are remembered so that
-Scintilla may set the initStyle parameter correctly on future calls to
-the
-lexer.
-
-
-Lexer organization
-
-There are at least two ways to organize the code of each lexer. Present
-lexers use what might be called a "character-based" approach: the outer
-loop iterates over characters, like this:
-
-  lengthDoc = startPos + length ;
-  for (unsigned int i = startPos; i < lengthDoc; i++) {
-    chNext = styler.SafeGetCharAt(i + 1);
-    << handle special cases >>
-    switch(state) {
-      // Handlers examine only ch and chNext.
-      // Handlers call styler.ColourTo(i,state) if the state changes.
-      case state_1: << handle ch in state 1 >>
-      case state_2: << handle ch in state 2 >>
-      ...
-      case state_n: << handle ch in state n >>
-    }
-    chPrev = ch;
-  }
-  styler.ColourTo(lengthDoc - 1, state);
-
-
-An alternative would be to use a "state-based" approach. The outer loop
-would iterate over states, like this:
-
-  lengthDoc = startPos+length ;
-  for ( unsigned int i = startPos ;; ) {
-    char ch = styler.SafeGetCharAt(i);
-    int new_state = 0 ;
-    switch ( state ) {
-      // scanners set new_state if they set the next state.
-      case state_1: << scan to the end of state 1 >> break ;
-      case state_2: << scan to the end of state 2 >> break ;
-      case default_state:
-        << scan to the next non-default state and set new_state >>
-    }
-    styler.ColourTo(i, state);
-    if ( i >= lengthDoc ) break ;
-    if ( ! new_state ) {
-      ch = styler.SafeGetCharAt(i);
-      << set state based on ch in the default state >>
-    }
-  }
-  styler.ColourTo(lengthDoc - 1, state);
-
-This approach might seem to be more natural. State scanners are simpler
-than character scanners because less needs to be done. For example,
-there is no need to test for the start of a C string inside the scanner
-for a C comment. Also this way makes it natural to define routines that
-could be used by more than one scanner; for example, a scanToEndOfLine
-routine.
-
-However, the special cases handled in the main loop in the
-character-based approach would have to be handled by each state scanner,
-so both approaches have advantages. These special cases are discussed
-below.
-
-Special case: Lead characters
-
-Lead bytes are part of DBCS processing for languages such as Japanese
-using an encoding such as Shift-JIS. In these encodings, extended
-(16-bit) characters are encoded as a lead byte followed by a trail byte.
-
-Lead bytes are rarely of any lexical significance, normally only being
-allowed within strings and comments. In such contexts, lexers should
-ignore ch if styler.IsLeadByte(ch) returns TRUE.
-
-Note: UTF-8 is simpler than Shift-JIS, so no special handling is
-applied for it. All UTF-8 extended characters are >= 128 and none are
-lexically significant in programming languages which, so far, use only
-characters in ASCII for operators, comment markers, etc.
-
-
-Special case: Folding
-
-Folding may be performed in the lexer function. It is better to use a
-separate folder function as that avoids some troublesome interaction
-between styling and folding. The folder function will be run after the
-lexer function if folding is enabled. The rest of this section explains
-how to perform folding within the lexer function.
-
-During initialization, lexers that support folding set
-
-    bool fold = styler.GetPropertyInt("fold");
-
-If folding is enabled in the editor, fold will be TRUE and the lexer
-should call:
-
-    styler.SetLevel(line, level);
-
-at the end of each line and just before exiting.
-
-The line parameter is simply the count of the number of newlines seen.
-Its initial value is styler.GetLine(startPos) and it is incremented
-(after calling styler.SetLevel) whenever a newline is seen.
-
-The level parameter is the desired indentation level in the low 12 bits,
-along with flag bits in the upper four bits. The indentation level
-depends on the language. For C++, it is incremented when the lexer sees
-a '{' and decremented when the lexer sees a '}' (outside of strings and
-comments, of course).
-
-The following flag bits, defined in Scintilla.h, may be set or cleared
-in the flags parameter. The SC_FOLDLEVELWHITEFLAG flag is set if the
-lexer considers that the line contains nothing but whitespace. The
-SC_FOLDLEVELHEADERFLAG flag indicates that the line is a fold point.
-This normally means that the next line has a greater level than the
-present line. However, the lexer may have some other basis for
-determining a fold point. For example, a lexer might create a header
-line for the first line of a function definition rather than the last.
-
-The SC_FOLDLEVELNUMBERMASK mask denotes the level number in the low 12
-bits of the level param. This mask may be used to isolate either flags
-or level numbers.
-
-For example, the C++ lexer contains the following code when a newline is
-seen:
-
-  if (fold) {
-    int lev = levelPrev;
-
-    // Set the "all whitespace" bit if the line is blank.
-    if (visChars == 0)
-      lev |= SC_FOLDLEVELWHITEFLAG;
-
-    // Set the "header" bit if needed.
-    if ((levelCurrent > levelPrev) && (visChars > 0))
-      lev |= SC_FOLDLEVELHEADERFLAG;
-    styler.SetLevel(lineCurrent, lev);
-
-    // reinitialize the folding vars describing the present line.
-    lineCurrent++;
-    visChars = 0; // Number of non-whitespace characters on the line.
-    levelPrev = levelCurrent;
-  }
-
-The following code appears in the C++ lexer just before exit:
-
-  // Fill in the real level of the next line, keeping the current flags
-  // as they will be filled in later.
-  if (fold) {
-    // Mask off the level number, leaving only the previous flags.
-    int flagsNext = styler.LevelAt(lineCurrent);
-    flagsNext &= ~SC_FOLDLEVELNUMBERMASK;
-    styler.SetLevel(lineCurrent, levelPrev | flagsNext);
-  }
-
-
-Don't worry about performance
-
-The writer of a lexer may safely ignore performance considerations: the
-cost of redrawing the screen is several orders of magnitude greater than
-the cost of function calls, etc. Moreover, Scintilla performs all the
-important optimizations; Scintilla ensures that a lexer will be called
-only to recolor text that actually needs to be recolored. Finally, it
-is not necessary to avoid extra calls to styler.ColourTo: the styler
-object buffers calls to ColourTo to avoid multiple updates of the
-screen.
-
-Page contributed by Edward K. Ream
\ No newline at end of file
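
For readers of this removed document, the "character-based" organization it describes can be illustrated with a small sketch. This is not code from Scintilla: MockStyler is a hypothetical stand-in for the Accessor class that mimics only the SafeGetCharAt and ColourTo behavior described in the text, StartSegment is an assumed helper of the mock, and the style numbers and the ColouriseMiniDoc name are made up for illustration. The sketch styles C-style block comments and double-quoted strings and ignores keywords, lead bytes, and folding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical style numbers for this sketch only.
enum Style { STYLE_DEFAULT = 0, STYLE_COMMENT = 1, STYLE_STRING = 2 };

// Hypothetical stand-in for the Accessor object (an assumption, not the
// real class). It remembers where the previous styled range ended.
class MockStyler {
public:
    explicit MockStyler(const std::string &text)
        : text_(text), styles_(text.size(), STYLE_DEFAULT), nextPos_(0) {}

    // Out-of-range positions yield a space in this mock.
    char SafeGetCharAt(std::size_t pos) const {
        return pos < text_.size() ? text_[pos] : ' ';
    }

    // Assumed helper: marks where the next styled range begins.
    void StartSegment(std::size_t pos) { nextPos_ = pos; }

    // Styles everything from the end of the previous range through pos.
    void ColourTo(std::size_t pos, int style) {
        assert(pos < text_.size());
        for (; nextPos_ <= pos; ++nextPos_) styles_[nextPos_] = style;
    }

    const std::vector<int> &Styles() const { return styles_; }

private:
    std::string text_;
    std::vector<int> styles_;
    std::size_t nextPos_;
};

// Character-based lexer: the outer loop walks characters and the switch
// dispatches on the current state, looking only at ch and chNext.
void ColouriseMiniDoc(std::size_t startPos, std::size_t length,
                      int initStyle, MockStyler &styler) {
    if (length == 0) return;
    const std::size_t endPos = startPos + length;
    int state = initStyle;
    styler.StartSegment(startPos);
    for (std::size_t i = startPos; i < endPos; ++i) {
        const char ch = styler.SafeGetCharAt(i);
        const char chNext = styler.SafeGetCharAt(i + 1);
        switch (state) {
        case STYLE_DEFAULT:
            if (ch == '/' && chNext == '*') {
                if (i > startPos) styler.ColourTo(i - 1, state);
                state = STYLE_COMMENT;
                ++i;  // skip '*' so that "/*/" does not close the comment
            } else if (ch == '"') {
                if (i > startPos) styler.ColourTo(i - 1, state);
                state = STYLE_STRING;
            }
            break;
        case STYLE_COMMENT:
            if (ch == '*' && chNext == '/') {
                ++i;  // the closing '/' belongs to the comment
                styler.ColourTo(i, state);
                state = STYLE_DEFAULT;
            }
            break;
        case STYLE_STRING:
            if (ch == '"') {
                styler.ColourTo(i, state);
                state = STYLE_DEFAULT;
            }
            break;
        }
    }
    // Flush whatever range is still open at the end of the text.
    styler.ColourTo(endPos - 1, state);
}
```

To try it, construct a MockStyler over a string, call ColouriseMiniDoc(0, text.size(), STYLE_DEFAULT, styler), and inspect styler.Styles(). Passing STYLE_COMMENT as initStyle models a call that resumes inside a comment opened before startPos.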