To read a token, first skip all blanks, tabs, newlines, carriage returns,
vertical tabs, form feeds, comments, and pragmas. Then read the longest
sequence of characters that forms an operator, an Id, or a Literal.
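The longest-sequence rule can be illustrated with a minimal Python sketch. The operator set below is a hypothetical subset chosen only because its members share prefixes, and the helper names skip_blanks and longest_operator are illustrative assumptions. Comments and pragmas are not handled here.

```python
# Characters the text says to skip: blank, tab, newline, carriage return,
# vertical tab, form feed. (Comments and pragmas are left out of this sketch.)
WHITESPACE = " \t\n\r\v\f"

# Hypothetical operator subset, chosen only to show overlapping prefixes.
OPERATORS = {"<", "<=", ":", ":=", "+"}


def skip_blanks(text, pos):
    """Return the first position at or after pos that is not whitespace."""
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    return pos


def longest_operator(text, pos):
    """Return the longest operator that starts at pos, or None if none does."""
    best = None
    for op in OPERATORS:
        if text.startswith(op, pos) and (best is None or len(op) > len(best)):
            best = op
    return best
```

Under these assumptions, scanning "a <= b" from position 1 skips the blank and then yields "<=" rather than "<", because the longer candidate wins.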
An Id is a case-significant sequence of letters, digits, and underscores
that begins with a letter. An Id is a keyword if it
appears in the list of keywords, a reserved identifier if it appears in the
list of reserved identifiers, and an ordinary identifier otherwise.
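As a rough sketch of this three-way classification in Python: the two sets below are small illustrative subsets, not the full lists of keywords and reserved identifiers, and the function name classify_id is an assumption made for this example.

```python
import re

# An Id: a letter followed by letters, digits, or underscores.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Illustrative subsets only; the complete lists are given elsewhere.
KEYWORDS = {"BEGIN", "END", "IF"}
RESERVED_IDENTIFIERS = {"INTEGER", "TRUE", "NEW"}


def classify_id(word):
    """Classify word as a keyword, reserved identifier, or ordinary identifier."""
    if not ID_PATTERN.fullmatch(word):
        raise ValueError(f"not an Id: {word!r}")
    # Set membership is an exact string comparison, so case is significant.
    if word in KEYWORDS:
        return "keyword"
    if word in RESERVED_IDENTIFIERS:
        return "reserved identifier"
    return "ordinary identifier"
```

Because an Id is case-significant, classify_id("BEGIN") gives "keyword" while classify_id("begin") gives "ordinary identifier".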
In the following grammar, terminals are characters surrounded by double-quotes
and the special terminal DQUOTE represents double-quote itself.

Id = Letter {Letter | Digit | "_"}.

Literal = Number | CharLiteral | TextLiteral.

CharLiteral = "'" (PrintingChar | Escape | DQUOTE) "'".

TextLiteral = DQUOTE {PrintingChar | Escape | "'"} DQUOTE.

Escape = "\" "n" | "\" "t" | "\" "r" | "\" "f" | "\" "\" | "\" "'"
       | "\" DQUOTE | "\" OctalDigit OctalDigit OctalDigit.

Number = Digit {Digit}
       | Digit {Digit} "_" HexDigit {HexDigit}
       | Digit {Digit} "." Digit {Digit} [Exp].

Exp = ("E" | "e" | "D" | "d" | "X" | "x") ["+" | "-"] Digit {Digit}.

PrintingChar = Letter | Digit | OtherChar.

HexDigit = Digit | "A" | "B" | "C" | "D" | "E" | "F"
         | "a" | "b" | "c" | "d" | "e" | "f".

Digit = "0" | "1" | ... | "9".

OctalDigit = "0" | "1" | ... | "7".

Letter = "A" | "B" | ... | "Z" | "a" | "b" | ... | "z".

OtherChar = " " | "!" | "#" | "$" | "%" | "&" | "(" | ")" | "*" | "+"
          | "," | "-" | "." | "/" | ":" | ";" | "<" | "=" | ">" | "?"
          | "@" | "[" | "]" | "^" | "_" | "`" | "{" | "|" | "}" | "~"
          | ExtendedChar.

ExtendedChar = any char with ISO-Latin-1 code in [8_240..8_377].
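The Number and Exp productions translate fairly directly into a regular expression. The following Python sketch is one possible transcription, not a definitive scanner; the names NUMBER and read_number are assumptions made for this example. It checks only the lexical form given by the grammar and does not check that the digits after "_" are valid for the stated base.

```python
import re

# Number = Digit {Digit}
#        | Digit {Digit} "_" HexDigit {HexDigit}
#        | Digit {Digit} "." Digit {Digit} [Exp].
# Exp    = ("E" | "e" | "D" | "d" | "X" | "x") ["+" | "-"] Digit {Digit}.
NUMBER = re.compile(
    r"[0-9]+"                        # Digit {Digit}
    r"(?:_[0-9A-Fa-f]+"              # "_" HexDigit {HexDigit}
    r"|\.[0-9]+"                     # "." Digit {Digit}
    r"(?:[EeDdXx][+-]?[0-9]+)?"      # optional Exp
    r")?"
)


def read_number(text, pos=0):
    """Return the longest Number starting at pos, or None if none starts there."""
    match = NUMBER.match(text, pos)
    return match.group(0) if match else None
```

For instance, read_number("16_FF") returns "16_FF", and read_number("1.") returns "1", since the grammar requires at least one digit after the decimal point.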