Preprocessor Commands

+KDiff3 supports two preprocessor options. +

Preprocessor-Command:: + When any file is read, it will be piped through this external command. + The output of this command will be visible instead of the original file. + You can write your own preprocessor that fulfills your specific needs. + Use this to cut away disturbing parts of the file, or to automatically + correct the indentation etc. +
Line-Matching Preprocessor-Command:: + When any file is read, it will be piped through this external command. If + a preprocessor-command (see above) is also specified, then the output of the + preprocessor is the input of the line-matching preprocessor. + The output will only be used during the line matching phase of the analysis. + You can write your own preprocessor that fulfills your specific needs. + Each input line must have a corresponding output line. +

+The idea is to allow the user greater flexibility while configuring the diff-result. +But this requires an external program, and many users don't want to write one themselves. +The good news is that very often sed or perl +will do the job. +

Example: Simple testcase: Consider file a.txt (6 lines): +

+      aa
+      ba
+      ca
+      da
+      ea
+      fa
+

+And file b.txt (3 lines): +

+      cg
+      dg
+      eg
+

+Without a preprocessor the following lines would be placed next to each other: +

+      aa - cg
+      ba - dg
+      ca - eg
+      da
+      ea
+      fa
+

+This is probably not wanted, since the first letter carries the interesting information. +To help the matching algorithm ignore the second letter, we can use a line-matching preprocessor +command that replaces 'g' with 'a': +

+   sed 's/g/a/'
+

+With this command the result of the comparison would be: +

+      aa
+      ba
+      ca - cg
+      da - dg
+      ea - eg
+      fa
+

+Internally the matching algorithm sees the files after running the line matching preprocessor, +but on the screen the file is unchanged. (The normal preprocessor would change the data also on +the screen.) +
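
The example above can be tried in a console before configuring KDiff3. The following is only a sketch: the temporary directory and the variable names are illustrative and not part of KDiff3. It recreates a.txt and b.txt, shows roughly what the line matcher would see, and checks that the command keeps one output line per input line.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' aa ba ca da ea fa > "$workdir/a.txt"
printf '%s\n' cg dg eg > "$workdir/b.txt"

# Roughly what the line matcher would see for each file.
seen_a=$(sed 's/g/a/' "$workdir/a.txt")
seen_b=$(sed 's/g/a/' "$workdir/b.txt")
printf '%s\n' "$seen_b"
# -> ca
# -> da
# -> ea

# Each input line must have a corresponding output line.
lines_in=$(wc -l < "$workdir/b.txt")
lines_out=$(sed 's/g/a/' "$workdir/b.txt" | wc -l)
[ "$lines_in" -eq "$lines_out" ] && echo "line count preserved"

rm -r "$workdir"
```

After the substitution the lines of b.txt are identical to three lines of a.txt, which is why the matcher can align them.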

sed Basics

+This section only introduces some very basic features of sed. For more +information see info:/sed or + +http://www.gnu.org/software/sed/manual/html_mono/sed.html. +A precompiled version for Windows can be found at +http://unxutils.sourceforge.net. +Note that the following examples assume that the sed-command is in some +directory in the PATH-environment variable. If this is not the case, you have to specify the full absolute +path for the command. +

Note

Also note that the following examples use the single quotation mark ('), which won't work on Windows. +On Windows you should use double quotation marks (") instead.

+In this context only the sed-substitute-command is used: +

+   sed 's/REGEXP/REPLACEMENT/FLAGS'
+

+Before you use a new command within KDiff3, you should first test it in a console. +Here the echo-command is useful. Example: +

+   echo abrakadabra | sed 's/a/o/'
+   -> obrakadabra
+

+This example shows a very simple sed-command that replaces the first occurrence +of "a" with "o". If you want to replace all occurrences then you need the "g"-flag: +

+   echo abrakadabra | sed 's/a/o/g'
+   -> obrokodobro
+

+The "|"-symbol is the pipe-command that transfers the output of the previous +command to the input of the following command. If you want to test with a longer file +then you can use cat on Unix-like systems or type +on Windows-like systems. sed will do the substitution for each line. +

+   cat filename | sed options
+
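
As a small console test of the per-line behaviour, the sketch below writes two lines to a temporary file (the file and variable names are made up for illustration). It also shows that sed can read the file directly, which is convenient while testing. KDiff3 itself pipes the data into the command, so there only the command without a filename is needed.

```shell
# Create a two-line sample file.
sample=$(mktemp)
printf '%s\n' abrakadabra banana > "$sample"

# Piped input, as in the example above.
via_pipe=$(cat "$sample" | sed 's/a/o/')
# sed can also read the file itself.
direct=$(sed 's/a/o/' "$sample")

printf '%s\n' "$via_pipe"
# -> obrakadabra
# -> bonana

rm "$sample"
```

Without the "g"-flag only the first "a" of each line is replaced, but every line is processed.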

Examples For sed-Use In KDiff3

Ignoring Other Types Of Comments

+Currently KDiff3 understands only C/C++ comments. Using the +Line-Matching-Preprocessor-Command you can also ignore +other types of comments, by converting them into C/C++-comments. + +Example: To ignore comments starting with "#", you would like to convert them +to "//". Note that you also must enable the "Ignore C/C++-Comments" option to get +an effect. An appropriate Line-Matching-Preprocessor-Command would be: + +

+   sed 's/#/\/\//'
+

+Since for sed the "/"-character has a special meaning, it is necessary to place the +"\"-character before each "/" in the replacement-string. Sometimes the "\" is required +to add or remove a special meaning of certain characters. The single quotation marks (') before +and after the substitution-command are important now, because otherwise the shell will +try to interpret some special characters like '#', '$' or '\' before passing them to +sed. Note that on Windows you will need the double quotation marks (") here. Windows +substitutes other characters like '%', so you might have to experiment a little bit. +
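
A quick console check of this command might look as follows. The sample line and variable names are illustrative. The second variant relies on the fact that sed accepts a delimiter other than "/" for the substitute-command, which avoids the backslashes.

```shell
line='count = 10  # number of retries'

# Escaped slashes, as in the command above.
escaped=$(printf '%s\n' "$line" | sed 's/#/\/\//')

# Same substitution with "|" as the delimiter, so "/" needs no escaping.
alt_delim=$(printf '%s\n' "$line" | sed 's|#|//|')

printf '%s\n' "$escaped"
# -> count = 10  // number of retries
```

Both forms replace only the first "#" on each line, which is enough to start a C++-style comment.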

Case-Insensitive Diff

+Use the following Line-Matching-Preprocessor-Command to convert all input to uppercase: +

+   sed 's/\(.*\)/\U\1/'
+

+Here the ".*" is a regular expression that matches any string and in this context matches +all characters in the line. +The "\1" in the replacement string refers to the matched text within the first pair of "\(" and "\)". +The "\U" (a GNU sed extension) converts the inserted text to uppercase. +
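
The sketch below tests the command in a console and assumes GNU sed, since "\U" is not available in every sed implementation. Where it is missing, a tool such as tr could serve as the line-matching preprocessor instead. That is a suggested alternative, not something KDiff3 prescribes.

```shell
# GNU sed: uppercase the whole line.
gnu_upper=$(echo 'Hello World' | sed 's/\(.*\)/\U\1/')

# Alternative using tr, which also keeps one output line per input line.
tr_upper=$(echo 'Hello World' | tr '[:lower:]' '[:upper:]')

echo "$gnu_upper"
# -> HELLO WORLD
```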

Ignoring Version Control Keywords

+CVS and other version control systems use several keywords to insert automatically +generated strings (info:/cvs/Keyword substitution). +All of them follow the pattern "$KEYWORD generated text$". We now need a +Line-Matching-Preprocessor-Command that removes only the generated text: +

+   sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'
+

+The "\|" separates the possible keywords. You might want to modify this list +according to your needs. +The "\" before the "$" is necessary because otherwise the "$" matches the end of the line. +
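
To see the effect, the following sketch runs two made-up header lines that differ only in the generated text through the command. It assumes GNU sed, because "\|" is a GNU extension in basic regular expressions. The function name strip_keywords is purely illustrative.

```shell
# Wrap the command in a function so it can be reused.
strip_keywords() {
  sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'
}

old=$(printf '%s\n' '// $Revision: 1.41 $' | strip_keywords)
new=$(printf '%s\n' '// $Revision: 1.42 $' | strip_keywords)

printf '%s\n' "$old"
# -> // $Revision$
```

Both lines become identical, so the line matcher treats them as equal. Note that ".*" is greedy: if a line contains several keywords, everything up to the last "$" on that line is removed.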

+While experimenting with sed you might come to understand and even like +these regular expressions. They are useful because there are many other programs that also +support similar things. +

Ignoring Numbers

+Ignoring numbers actually is a built-in option. But as another example, this is how +it would look as a Line-Matching-Preprocessor-command. +

+   sed 's/[0123456789.-]//g'
+

+Any character within '[' and ']' is a match and will be replaced with nothing. +
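
A console test with an invented sample line is sketched below. The second form writes the digits as the range "0-9", which matches the same characters in the usual locales.

```shell
reading='value = -3.14 at step 42'

# Explicit list of characters, as in the command above.
no_numbers=$(printf '%s\n' "$reading" | sed 's/[0123456789.-]//g')

# Shorter spelling using a range; the trailing "-" is taken literally.
range_form=$(printf '%s\n' "$reading" | sed 's/[0-9.-]//g')

printf '[%s]\n' "$no_numbers"
# -> [value =  at step ]
```

The brackets in the printf format are only there to make the remaining spaces visible.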

Ignoring Certain Columns

+Sometimes a text is very strictly formatted, and contains columns that you always want to ignore, while there are +other columns you want to preserve for analysis. In the following example the first five columns (characters) are +ignored, the next ten columns are preserved, then again five columns are ignored and the rest of the line is preserved. +

+   sed 's/.....\(..........\).....\(.*\)/\1\2/'
+

+Each dot '.' matches any single character. The "\1" and "\2" in the replacement string refer to the matched text within the first +and second pair of "\(" and "\)" denoting the text to be preserved. +
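
The sketch below applies the command to an invented 25-character record and to a line that is shorter than the 20 characters the pattern requires. The variable names are illustrative.

```shell
columns='s/.....\(..........\).....\(.*\)/\1\2/'

# 5 ignored + 10 kept + 5 ignored + rest kept.
kept=$(printf '%s\n' '12345ABCDEFGHIJ67890 tail' | sed "$columns")

# A line with fewer than 20 characters does not match and is left unchanged.
short=$(printf '%s\n' 'too short' | sed "$columns")

printf '%s\n' "$kept"
# -> ABCDEFGHIJ tail
```

Because short lines pass through unchanged, it is worth checking that all relevant lines in your files are long enough for the pattern.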

Combining Several Substitutions

+Sometimes you want to apply several substitutions at once. You can then use the +semicolon ';' to separate these from each other. Example: +

+   echo abrakadabra | sed 's/a/o/g;s/\(.*\)/\U\1/'
+   -> OBROKODOBRO
+
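
As an alternative spelling, sed also accepts several "-e" options, each carrying one command. The sketch below compares both forms and assumes GNU sed because of "\U".

```shell
# Commands separated by a semicolon, as above.
semicolon=$(echo abrakadabra | sed 's/a/o/g;s/\(.*\)/\U\1/')

# The same commands passed as separate -e options.
separate=$(echo abrakadabra | sed -e 's/a/o/g' -e 's/\(.*\)/\U\1/')

echo "$separate"
# -> OBROKODOBRO
```

The commands are applied to each line in the order given, so the second substitution sees the result of the first.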

Using perl instead of sed

+Instead of sed you might want to use something else like +perl. +

+   perl -p -e 's/REGEXP/REPLACEMENT/FLAGS'
+

+But some details are different in perl. Note that where +sed needed "\(" and "\)" perl +requires the simpler "(" and ")" without preceding '\'. Example: +

+   sed 's/\(.*\)/\U\1/'
+   perl -p -e 's/(.*)/\U\1/'
+

Order Of Preprocessor Execution

+The data is piped through all internal and external preprocessors in the +following order: +

Normal preprocessor,
Line-Matching-Preprocessor,
Ignore case (conversion to uppercase),
Detection of C/C++ comments,
Ignore numbers,
Ignore white space

+The data after the normal preprocessor will be preserved for display and merging. The +other operations only modify the data that the line-matching-diff-algorithm sees. +

+In the rare cases where you use a normal preprocessor note that +the line-matching-preprocessor sees the output of the normal preprocessor as input. +
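
The first two stages can be imitated in a console by chaining two commands. This is only a sketch of the data flow. Both commands and the function names are hypothetical examples: a normal preprocessor that strips trailing spaces, and a line-matching preprocessor that converts "#" comments.

```shell
# Hypothetical normal preprocessor: remove trailing spaces.
normal_preprocessor() { sed 's/ *$//'; }
# Hypothetical line-matching preprocessor: turn "#" into "//".
line_matching_preprocessor() { sed 's/#/\/\//'; }

original='x = 1 # note   '

# This is what would be displayed and merged.
displayed=$(printf '%s\n' "$original" | normal_preprocessor)
# This is what only the line matcher would see.
matched=$(printf '%s\n' "$displayed" | line_matching_preprocessor)

printf '[%s]\n' "$displayed" "$matched"
# -> [x = 1 # note]
# -> [x = 1 // note]
```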

Warning

+The preprocessor-commands are often very useful, but as with any option that modifies +your texts or hides away certain differences automatically, you might accidentally overlook +certain differences and in the worst case destroy important data. +

+For this reason, if a normal preprocessor-command is being used during a merge, KDiff3 +will tell you so and ask you whether it should be disabled. +But it won't warn you if a Line-Matching-Preprocessor-command is active. The merge will not complete until +all conflicts are solved. If you disabled "Show White Space" then the differences that +were removed with the Line-Matching-Preprocessor-command will also be invisible. If the +Save-button remains disabled during a merge (because of remaining conflicts), make sure to enable +"Show White Space". If you don't want to merge these less important differences manually +you can select "Choose [A|B|C] For All Unsolved White space Conflicts" in the Merge-menu. +