2.8.12 Token productions

To read a token, first skip all blanks, tabs, newlines, carriage returns, vertical tabs, form feeds, comments, and pragmas. Then read the longest sequence of characters that forms an operator, an Id, or a Literal.
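The longest-match rule can be illustrated with a minimal Python sketch. It makes several assumptions: the names SKIP, OPERATORS, next_token, and tokens are hypothetical, the operator list is only a small illustrative subset, and comments, pragmas, and Literals are not handled at all.

```python
import re

# Characters skipped before a token: blank, tab, newline, carriage return,
# vertical tab, form feed. Comments and pragmas are NOT handled in this sketch.
SKIP = " \t\n\r\v\f"

# Illustrative subset of operators (an assumption, not the full list).
OPERATORS = ["<", "<=", ":", ":=", ".", "..", "=", "+"]

# Id = Letter {Letter | Digit | "_"}.
ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def next_token(src, pos=0):
    """Return (token, new_pos), or (None, pos) at end of input."""
    while pos < len(src) and src[pos] in SKIP:
        pos += 1
    if pos == len(src):
        return None, pos
    # Collect every operator or Id that starts here, then keep the longest.
    candidates = [op for op in OPERATORS if src.startswith(op, pos)]
    m = ID_RE.match(src, pos)
    if m:
        candidates.append(m.group())
    if not candidates:
        raise ValueError(f"no token at offset {pos}")
    tok = max(candidates, key=len)
    return tok, pos + len(tok)


def tokens(src):
    """Split src into a list of tokens using next_token."""
    out, pos = [], 0
    while True:
        tok, pos = next_token(src, pos)
        if tok is None:
            return out
        out.append(tok)
```

Under these assumptions, tokens("a<=b") yields "a", "<=", "b" rather than splitting "<=" into "<" and "=", and tokens("a..b") reads ".." as a single token.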

An Id is a case-significant sequence of letters, digits, and underscores that begins with a letter. An Id is a keyword if it appears in the list of keywords, a reserved identifier if it appears in the list of reserved identifiers, and an ordinary identifier otherwise.
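As a rough illustration of the three-way classification, the following Python sketch checks the lexical form of an Id and then looks it up in two sets. The sets KEYWORDS and RESERVED hold only a few illustrative entries (an assumption; the complete lists are given elsewhere), and the function name classify_id is hypothetical.

```python
import re

# Id = Letter {Letter | Digit | "_"}.
ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Illustrative subsets only (assumption); the full lists are defined elsewhere.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE"}
RESERVED = {"INTEGER", "BOOLEAN", "TRUE", "FALSE", "NIL"}


def classify_id(text):
    """Classify text as a keyword, reserved identifier, or ordinary identifier."""
    if not ID_RE.fullmatch(text):
        return "not an Id"
    # Lookups are exact because Ids are case-significant.
    if text in KEYWORDS:
        return "keyword"
    if text in RESERVED:
        return "reserved identifier"
    return "ordinary identifier"
```

Because Ids are case-significant, classify_id("BEGIN") reports a keyword while classify_id("begin") reports an ordinary identifier. A string such as "_x" is not an Id, since an Id must begin with a letter.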

In the following grammar, terminals are characters surrounded by double-quotes and the special terminal DQUOTE represents double-quote itself.

Id = Letter {Letter | Digit | "_"}.

Literal = Number | CharLiteral | TextLiteral.

CharLiteral = "'"  (PrintingChar | Escape | DQUOTE) "'".

TextLiteral = DQUOTE {PrintingChar | Escape | "'"} DQUOTE.

Escape = "\" "n"   | "\" "t"     | "\" "r"     | "\" "f"
       | "\" "\"   | "\" "'"     | "\" DQUOTE
       | "\" OctalDigit OctalDigit OctalDigit.
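The Escape production can be read as a small decoding procedure. The Python sketch below is one possible reading; the names SIMPLE and decode_escape are hypothetical, and the sketch checks only the lexical form (it does not check that a three-digit octal value falls within any particular character range).

```python
# Single-character escapes listed in the Escape production.
SIMPLE = {
    "n": "\n", "t": "\t", "r": "\r", "f": "\f",
    "\\": "\\", "'": "'", '"': '"',
}


def decode_escape(src, pos=0):
    """Decode the Escape starting at src[pos]; return (char, new_pos)."""
    if pos >= len(src) or src[pos] != "\\":
        raise ValueError("an escape must start with a backslash")
    nxt = src[pos + 1 : pos + 2]
    if nxt in SIMPLE:
        return SIMPLE[nxt], pos + 2
    # Otherwise exactly three octal digits are required.
    digits = src[pos + 1 : pos + 4]
    if len(digits) == 3 and all(d in "01234567" for d in digits):
        return chr(int(digits, 8)), pos + 4
    raise ValueError(f"malformed escape at offset {pos}")
```

For example, decode_escape(r"\101") returns the character "A" (octal 101 is decimal 65), while r"\12" is rejected because the octal form needs exactly three digits.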

Number = Digit {Digit}
       | Digit {Digit} "_" HexDigit {HexDigit}
       | Digit {Digit} "." Digit {Digit} [Exp].

Exp = ("E" | "e" | "D" | "d" | "X" | "x") ["+" | "-"] Digit {Digit}.
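The Number and Exp productions translate fairly directly into a regular expression. The following Python sketch is one such translation; NUMBER_RE and read_number are hypothetical names, and the sketch checks lexical form only. Python alternation is ordered rather than longest-match, so the longer alternatives are listed first.

```python
import re

NUMBER_RE = re.compile(
    # Digit {Digit} "." Digit {Digit} [Exp]
    r"[0-9]+\.[0-9]+(?:[EeDdXx][+-]?[0-9]+)?"
    # Digit {Digit} "_" HexDigit {HexDigit}
    r"|[0-9]+_[0-9A-Fa-f]+"
    # Digit {Digit}
    r"|[0-9]+"
)


def read_number(src, pos=0):
    """Return the longest Number starting at src[pos], or None if there is none."""
    m = NUMBER_RE.match(src, pos)
    return m.group() if m else None
```

Two consequences of the grammar are visible here. In "1..5" only "1" is read as a Number, because "." must be followed by a digit. In "1e5" only "1" is read as well, because an exponent is permitted only after a fractional part.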

PrintingChar = Letter | Digit | OtherChar.

HexDigit = Digit | "A" | "B" | "C" | "D" | "E" | "F"
                 | "a" | "b" | "c" | "d" | "e" | "f".

Digit = "0" | "1" | ... | "9".

OctalDigit = "0" | "1" | ... | "7".

Letter = "A"  | "B"  | ... | "Z"  | "a"  | "b"  | ... | "z".

OtherChar = " " | "!" | "#" | "$" | "%" | "&" | "(" | ")"
          | "*" | "+" | "," | "-" | "." | "/" | ":" | ";"
          | "<" | "=" | ">" | "?" | "@" | "[" | "]" | "^"
          | "_" | "`" | "{" | "|" | "}" | "~"
          | ExtendedChar.

ExtendedChar = any char with ISO-Latin-1 code in [8_240..8_377].
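The PrintingChar, OtherChar, and ExtendedChar productions can be summarized as a single membership test. The Python sketch below assumes characters are represented by their ISO-Latin-1 codes; the function name is_printing_char is hypothetical.

```python
def is_printing_char(ch):
    """True if ch is a PrintingChar (Letter, Digit, or OtherChar)."""
    code = ord(ch)
    # ExtendedChar: codes 8_240..8_377, that is 160..255 in decimal.
    if 0o240 <= code <= 0o377:
        return True
    # Letters, digits, and the listed OtherChars together cover printable
    # ASCII (32..126) except double-quote, single-quote, and backslash.
    return 0x20 <= code <= 0x7E and ch not in "\"'\\"
```

The three excluded characters explain the shape of the literal productions: CharLiteral admits DQUOTE explicitly, TextLiteral admits "'" explicitly, and a backslash can appear in either literal only as part of an Escape.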
