summaryrefslogtreecommitdiffstats
path: root/doc/Scintilla/Lexer.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/Scintilla/Lexer.txt')
-rw-r--r--doc/Scintilla/Lexer.txt226
1 files changed, 0 insertions, 226 deletions
diff --git a/doc/Scintilla/Lexer.txt b/doc/Scintilla/Lexer.txt
deleted file mode 100644
index 9d4ab50..0000000
--- a/doc/Scintilla/Lexer.txt
+++ /dev/null
@@ -1,226 +0,0 @@
-How to write a scintilla lexer
-
-A lexer for a particular language determines how a specified range of
-text shall be colored. Writing a lexer is relatively straightforward
-because the lexer need only color given text. The harder job of
-determining how much text actually needs to be colored is handled by
-Scintilla itself, that is, the lexer's caller.
-
-
-Parameters
-
-The lexer for language LLL has the following prototype:
-
- static void ColouriseLLLDoc (
- unsigned int startPos, int length,
- int initStyle,
- WordList *keywordlists[],
- Accessor &styler);
-
-The styler parameter is an Accessor object. The lexer must use this
-object to access the text to be colored. The lexer gets the character
-at position i using styler.SafeGetCharAt(i);
-
-The startPos and length parameters indicate the range of text to be
-recolored; the lexer must determine the proper color for all characters
-in positions startPos through startPos+length.
-
-The initStyle parameter indicates the initial state, that is, the state
-at the character before startPos. States also indicate the coloring to
-be used for a particular range of text.
-
-Note: the character at StartPos is assumed to start a line, so if a
-newline terminates the initStyle state the lexer should enter its
-default state (or whatever state should follow initStyle).
-
-The keywordlists parameter specifies the keywords that the lexer must
-recognize. A WordList class object contains methods that make simplify
-the recognition of keywords. Present lexers use a helper function
-called classifyWordLLL to recognize keywords. These functions show how
-to use the keywordlists parameter to recognize keywords. This
-documentation will not discuss keywords further.
-
-
-The lexer code
-
-The task of a lexer can be summarized briefly: for each range r of
-characters that are to be colored the same, the lexer should call
-
- styler.ColourTo(i, state)
-
-where i is the position of the last character of the range r. The lexer
-should set the state variable to the coloring state of the character at
-position i and continue until the entire text has been colored.
-
-Note 1: the styler (Accessor) object remembers the i parameter in the
-previous calls to styler.ColourTo, so the single i parameter suffices to
-indicate a range of characters.
-
-Note 2: As a side effect of calling styler.ColourTo(i,state), the
-coloring states of all characters in the range are remembered so that
-Scintilla may set the initStyle parameter correctly on future calls to
-the
-lexer.
-
-
-Lexer organization
-
-There are at least two ways to organize the code of each lexer. Present
-lexers use what might be called a "character-based" approach: the outer
-loop iterates over characters, like this:
-
- lengthDoc = startPos + length ;
- for (unsigned int i = startPos; i < lengthDoc; i++) {
- chNext = styler.SafeGetCharAt(i + 1);
- << handle special cases >>
- switch(state) {
- // Handlers examine only ch and chNext.
- // Handlers call styler.ColorTo(i,state) if the state changes.
- case state_1: << handle ch in state 1 >>
- case state_2: << handle ch in state 2 >>
- ...
- case state_n: << handle ch in state n >>
- }
- chPrev = ch;
- }
- styler.ColourTo(lengthDoc - 1, state);
-
-
-An alternative would be to use a "state-based" approach. The outer loop
-would iterate over states, like this:
-
- lengthDoc = startPos+lenth ;
- for ( unsigned int i = startPos ;; ) {
- char ch = styler.SafeGetCharAt(i);
- int new_state = 0 ;
- switch ( state ) {
- // scanners set new_state if they set the next state.
- case state_1: << scan to the end of state 1 >> break ;
- case state_2: << scan to the end of state 2 >> break ;
- case default_state:
- << scan to the next non-default state and set new_state >>
- }
- styler.ColourTo(i, state);
- if ( i >= lengthDoc ) break ;
- if ( ! new_state ) {
- ch = styler.SafeGetCharAt(i);
- << set state based on ch in the default state >>
- }
- }
- styler.ColourTo(lengthDoc - 1, state);
-
-This approach might seem to be more natural. State scanners are simpler
-than character scanners because less needs to be done. For example,
-there is no need to test for the start of a C string inside the scanner
-for a C comment. Also this way makes it natural to define routines that
-could be used by more than one scanner; for example, a scanToEndOfLine
-routine.
-
-However, the special cases handled in the main loop in the
-character-based approach would have to be handled by each state scanner,
-so both approaches have advantages. These special cases are discussed
-below.
-
-Special case: Lead characters
-
-Lead bytes are part of DBCS processing for languages such as Japanese
-using an encoding such as Shift-JIS. In these encodings, extended
-(16-bit) characters are encoded as a lead byte followed by a trail byte.
-
-Lead bytes are rarely of any lexical significance, normally only being
-allowed within strings and comments. In such contexts, lexers should
-ignore ch if styler.IsLeadByte(ch) returns TRUE.
-
-Note: UTF-8 is simpler than Shift-JIS, so no special handling is
-applied for it. All UTF-8 extended characters are >= 128 and none are
-lexically significant in programming languages which, so far, use only
-characters in ASCII for operators, comment markers, etc.
-
-
-Special case: Folding
-
-Folding may be performed in the lexer function. It is better to use a
-separate folder function as that avoids some troublesome interaction
-between styling and folding. The folder function will be run after the
-lexer function if folding is enabled. The rest of this section explains
-how to perform folding within the lexer function.
-
-During initialization, lexers that support folding set
-
- bool fold = styler.GetPropertyInt("fold");
-
-If folding is enabled in the editor, fold will be TRUE and the lexer
-should call:
-
- styler.SetLevel(line, level);
-
-at the end of each line and just before exiting.
-
-The line parameter is simply the count of the number of newlines seen.
-It's initial value is styler.GetLine(startPos) and it is incremented
-(after calling styler.SetLevel) whenever a newline is seen.
-
-The level parameter is the desired indentation level in the low 12 bits,
-along with flag bits in the upper four bits. The indentation level
-depends on the language. For C++, it is incremented when the lexer sees
-a '{' and decremented when the lexer sees a '}' (outside of strings and
-comments, of course).
-
-The following flag bits, defined in Scintilla.h, may be set or cleared
-in the flags parameter. The SC_FOLDLEVELWHITEFLAG flag is set if the
-lexer considers that the line contains nothing but whitespace. The
-SC_FOLDLEVELHEADERFLAG flag indicates that the line is a fold point.
-This normally means that the next line has a greater level than present
-line. However, the lexer may have some other basis for determining a
-fold point. For example, a lexer might create a header line for the
-first line of a function definition rather than the last.
-
-The SC_FOLDLEVELNUMBERMASK mask denotes the level number in the low 12
-bits of the level param. This mask may be used to isolate either flags
-or level numbers.
-
-For example, the C++ lexer contains the following code when a newline is
-seen:
-
- if (fold) {
- int lev = levelPrev;
-
- // Set the "all whitespace" bit if the line is blank.
- if (visChars == 0)
- lev |= SC_FOLDLEVELWHITEFLAG;
-
- // Set the "header" bit if needed.
- if ((levelCurrent > levelPrev) && (visChars > 0))
- lev |= SC_FOLDLEVELHEADERFLAG;
- styler.SetLevel(lineCurrent, lev);
-
- // reinitialize the folding vars describing the present line.
- lineCurrent++;
- visChars = 0; // Number of non-whitespace characters on the line.
- levelPrev = levelCurrent;
- }
-
-The following code appears in the C++ lexer just before exit:
-
- // Fill in the real level of the next line, keeping the current flags
- // as they will be filled in later.
- if (fold) {
- // Mask off the level number, leaving only the previous flags.
- int flagsNext = styler.LevelAt(lineCurrent);
- flagsNext &= ~SC_FOLDLEVELNUMBERMASK;
- styler.SetLevel(lineCurrent, levelPrev | flagsNext);
- }
-
-
-Don't worry about performance
-
-The writer of a lexer may safely ignore performance considerations: the
-cost of redrawing the screen is several orders of magnitude greater than
-the cost of function calls, etc. Moreover, Scintilla performs all the
-important optimizations; Scintilla ensures that a lexer will be called
-only to recolor text that actually needs to be recolored. Finally, it
-is not necessary to avoid extra calls to styler.ColourTo: the sytler
-object buffers calls to ColourTo to avoid multiple updates of the
-screen.
-
-Page contributed by Edward K. Ream \ No newline at end of file