|2. List of Keys|
|3. How to use Regular Expressions|
|4. Sample Language Definition|
The language definitions are stored in the file Languages.ini, which is located in the program's root directory; in most cases this is C:\Programs\Source2Html\. Each language definition has its own section named [Language%], where % is an integer. The numbers must start at 1 and be consecutive: the program stops reading as soon as the first number is missing.
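The numbering rule can be illustrated with a minimal Python sketch. It is not the program's own reader; it only assumes that Languages.ini follows ordinary INI syntax that Python's configparser can parse. The section contents and the helper name read_language_names are made up for illustration.

```python
import configparser

# Hypothetical file contents: note that [Language3] is missing.
SAMPLE = r"""
[Language1]
Name=C/C++

[Language2]
Name=HTML

[Language4]
Name=Skipped
"""

def read_language_names(ini_text):
    """Collect the Name of each [Language%] section, stopping at the first gap."""
    parser = configparser.ConfigParser(interpolation=None)  # "%" and "\" stay literal
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(ini_text)
    names = []
    n = 1
    while parser.has_section(f"Language{n}"):
        names.append(parser[f"Language{n}"]["Name"])
        n += 1
    return names

print(read_language_names(SAMPLE))  # ['C/C++', 'HTML']
```

Because [Language3] is absent, [Language4] is never reached, which mirrors the behaviour described above.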
The table below lists all valid keys of a language definition. If a key is missing, its default value is used.
|Key||Description||Default|
|Name||Name of the language||none - must be given|
|Symbols||List of all legal symbols||empty|
|Extensions||List of typical file extensions, separated by ";"||empty|
|CaseSensitive||Is the language case sensitive? Relevant when matching against the word lists||true|
|AllowWhiteAfterFirst||Allow whitespace after the first matching character of preprocessor directives||true|
|String||Characters that start/end a string||deactivated|
|Character||Characters that start/end a character chain||deactivated|
|SillyStringHandling||Allow multiline strings without an escape character; may become obsolete in future releases||false|
|EscapeChar||Escape character in strings or character chains||deactivated|
|SingleLineComment||Single line comment starting sequence||empty|
|MultiLineComment||Multiline comment starting and ending sequences||empty|
|Preprocessor||Preprocessor word list||empty|
|Words%||User defined word list with % being an integer starting from 1||empty|
|Words%Type||Code item type used to colorize matching words||none, must be specified|
|Words%ExchangeType||Code item type used to colorize following identifiers (identifier type exchange)||none|
|Words%EndSequence||Sequence that ends identifier type exchange||none|
|Regex%||Regular expression with % being an integer starting from 1||empty|
|Regex%Type||Code item type used to colorize matching strings||empty|
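|Regex%ExchangeType||Code item type used to colorize following identifiers; same as Words%ExchangeType||none|
|Regex%EndSequence||Sequence that ends identifier type exchange; same as Words%EndSequence||none|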
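As a rough illustration of how the defaults apply, the following Python sketch looks up a key and falls back to the table's default when the key is missing. The helper names get_key and extensions, the demo definition and the extension values are hypothetical. Whether extensions are written with or without a leading dot is not stated here, so the values are only placeholders.

```python
# Defaults for a few keys, copied from the table above.
DEFAULTS = {
    "Symbols": "",
    "Extensions": "",
    "CaseSensitive": "true",
    "AllowWhiteAfterFirst": "true",
    "SillyStringHandling": "false",
}

def get_key(definition, key):
    """Return the value of a key, falling back to its default if it is missing."""
    if key in definition:
        return definition[key]
    if key == "Name":
        raise KeyError("Name has no default and must be given")
    return DEFAULTS[key]

def extensions(definition):
    """Split the Extensions value on ';' and drop empty entries."""
    return [e for e in get_key(definition, "Extensions").split(";") if e]

# Hypothetical definition; the values are invented for illustration.
demo = {"Name": "Demo", "Extensions": "c;h;cpp"}
print(extensions(demo))                # ['c', 'h', 'cpp']
print(get_key(demo, "CaseSensitive"))  # true
```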
You may use regular expression (regex) syntax to identify code items. Specify a regex string with the key Regex%, where % is an integer starting from 1. The item type identified by Regex% is defined by the key Regex%Type. Valid item types are Comment, Keyword, Identifier, Symbol, String, Number, Character, Preprocessor, Custom1, Custom2, Custom3 and Custom4.
An ordinary character in the regex string matches exactly that character. In addition, you may use the codes in the following table to require either one or an arbitrary number of alphas, numbers, whitespace characters, or any characters. "Arbitrary" means any number, including zero. Two arbitrary codes must not follow each other directly. To match a literal backslash, escape it with an additional backslash, i.e. "\\".
|\0||one arbitrary character|
|\1||one whitespace|
|\2||one number|
|\3||one number or whitespace|
|\4||one alpha|
|\5||one alpha or whitespace|
|\6||one alpha or number|
|\7||one alpha or number or whitespace|
|\8||arbitrary number of arbitrary characters|
|\9||arbitrary number of whitespace|
|\A||arbitrary number of numbers|
|\B||arbitrary number of numbers or whitespace|
|\C||arbitrary number of alpha|
|\D||arbitrary number of alpha or whitespace|
|\E||arbitrary number of alpha or numbers|
|\F||arbitrary number of alpha or numbers or whitespace|
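The rule that two arbitrary codes must not follow each other can be checked mechanically. The Python sketch below is only an illustration and not part of Source2Html. The function names tokenize and has_adjacent_arbitrary are hypothetical, and it assumes that the codes are written in upper case exactly as in the table.

```python
# Codes that stand for an arbitrary number (zero or more) of characters.
ARBITRARY = set("89ABCDEF")

def tokenize(regex):
    """Split a regex string into literal characters and two-character backslash codes."""
    tokens = []
    i = 0
    while i < len(regex):
        if regex[i] == "\\":
            if i + 1 >= len(regex):
                raise ValueError("backslash at end of regex string")
            tokens.append(regex[i:i + 2])  # e.g. "\\C" or an escaped backslash
            i += 2
        else:
            tokens.append(regex[i])
            i += 1
    return tokens

def has_adjacent_arbitrary(regex):
    """Return True if two arbitrary codes directly follow each other."""
    def is_arbitrary(token):
        return len(token) == 2 and token[1] in ARBITRARY
    tokens = tokenize(regex)
    return any(is_arbitrary(a) and is_arbitrary(b)
               for a, b in zip(tokens, tokens[1:]))

print(has_adjacent_arbitrary(r"\9\C"))  # True: not allowed
print(has_adjacent_arbitrary(r"\6\E"))  # False: one alpha or number, then arbitrary
```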
The following table contains some examples of useful regex strings.
|Regex String||Explanation||Example string|
|"\\\4\C"||A backslash followed by one or more alphas||"\Section"|
|"\\%"||A backslash followed by a "%"||"\%"|
|"#\2\A"||A "#" followed by one or more numbers||"#97"|
|"&\4\C;"||A "&" followed by one or more alphas and ended by a ";"||"&nbsp;"|
|"<!--\8-->"||A "<!--" followed by an arbitrary number of characters and ended by "-->"||"<!-- HTML-Comment -->"|
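To experiment with such regex strings outside the program, you can translate them into Python's re syntax. The sketch below is not how Source2Html matches internally, and it makes several assumptions: "alpha" means A-Z and a-z, "number" means 0-9, "whitespace" means any whitespace character, and \8 matches as few characters as possible. The single-character codes \1, \2 and \4 are treated as one whitespace, one number and one alpha, based on the examples and the pattern of the other codes. The function name to_python_regex is hypothetical.

```python
import re

# Assumed character classes for "alpha", "number" and "whitespace".
ALPHA, NUMBER, WHITE = "A-Za-z", "0-9", r"\s"

# Code -> (character class body, arbitrary-count flag).
# \1, \2 and \4 are assumptions inferred from the examples and the other codes.
CODES = {
    "1": (WHITE, False),
    "2": (NUMBER, False),
    "3": (NUMBER + WHITE, False),
    "4": (ALPHA, False),
    "5": (ALPHA + WHITE, False),
    "6": (ALPHA + NUMBER, False),
    "7": (ALPHA + NUMBER + WHITE, False),
    "9": (WHITE, True),
    "A": (NUMBER, True),
    "B": (NUMBER + WHITE, True),
    "C": (ALPHA, True),
    "D": (ALPHA + WHITE, True),
    "E": (ALPHA + NUMBER, True),
    "F": (ALPHA + NUMBER + WHITE, True),
}

def to_python_regex(regex):
    """Translate a Source2Html-style regex string into a Python re pattern."""
    out = []
    i = 0
    while i < len(regex):
        ch = regex[i]
        if ch != "\\":
            out.append(re.escape(ch))  # ordinary character matches itself
            i += 1
            continue
        if i + 1 >= len(regex):
            raise ValueError("backslash at end of regex string")
        code = regex[i + 1]
        i += 2
        if code == "\\":
            out.append(r"\\")   # escaped backslash
        elif code == "0":
            out.append(".")     # one arbitrary character
        elif code == "8":
            out.append(".*?")   # arbitrary characters (non-greedy is an assumption)
        elif code in CODES:
            body, many = CODES[code]
            out.append("[" + body + "]" + ("*" if many else ""))
        else:
            raise ValueError("unknown code: \\" + code)
    return "".join(out)

# The example strings from the table above.
EXAMPLES = [
    (r"\\\4\C", r"\Section"),
    (r"\\%", r"\%"),
    (r"#\2\A", "#97"),
    (r"<!--\8-->", "<!-- HTML-Comment -->"),
]
for pattern, text in EXAMPLES:
    print(pattern, text, bool(re.fullmatch(to_python_regex(pattern), text)))
```

Each example string should be reported as a match under these assumptions; a string such as "#" alone does not match "#\2\A", because \2 requires one number.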
Below is an annotated sample definition for an artificial language that is mainly a mixture of C/C++, LaTeX and HTML.
last updated: 06 January 2005 © 2000-2005 by Lars Haendel