Lexical structure v16
There are several aspects to the lexical structure of SQL:
SQL input consists of a sequence of commands.
A command is composed of a sequence of tokens, terminated by a semicolon (;). The end of the input stream also terminates a command.
The valid tokens depend on the syntax of the command.
A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol. Tokens are normally separated by whitespace (space, tab, new line) but don't need to be if there's no ambiguity (which is generally the case only if a special character is adjacent to some other token type).
Comments can occur in SQL input. They aren't tokens; they are equivalent to whitespace.
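As a minimal sketch of the last two points, the statements below show optional whitespace around a special character and comments standing in for whitespace. The column alias `three` is illustrative only, and the `--` and `/* */` comment forms are the usual SQL comment syntax rather than something defined in this section.

```sql
-- Whitespace around the special character + is optional here,
-- so these two commands consist of the same tokens.
SELECT 1+2 AS three;
SELECT 1 + 2 AS three;

-- A comment is equivalent to whitespace, so it can sit between tokens.
SELECT 1 /* block comment between tokens */ + 2 AS three;  -- trailing comment
```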
For example, the following is syntactically valid SQL input:
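A minimal sketch of such input, assuming a hypothetical table `my_table` with an integer column `a` and a text column `b` (these names are illustrative assumptions, not taken from the source):

```sql
-- Assumes a hypothetical table my_table (a integer, b text) already exists.
SELECT * FROM my_table;
UPDATE my_table SET a = 5;
INSERT INTO my_table VALUES (3, 'hi there');
```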
This is a sequence of three commands, one per line, although that format isn't required. You can enter more than one command on a line, and commands can usually be split across lines.
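The layout freedom described above can be sketched as follows. The alias `three` is illustrative only, and the indentation is a matter of taste rather than a requirement.

```sql
-- Two commands on one line; each is terminated by its own semicolon.
SELECT 1; SELECT 2;

-- One command split across several lines; line breaks count as whitespace.
SELECT
    1 + 2
        AS three;
```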
The SQL syntax isn't very consistent regarding the tokens that identify commands and the ones that are operands or parameters. The first few tokens are generally the command name, so the example contains a SELECT, an UPDATE, and an INSERT command. But, for instance, the UPDATE command always requires a SET token to appear in a certain position, and this variation of INSERT also requires a VALUES token to be complete. The precise syntax rules for each command are described in the SQL reference.
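To illustrate why the text says "this variation of INSERT", here is a hedged sketch of other INSERT forms in PostgreSQL-style SQL that don't use a bare VALUES list. The temporary table `my_table` and its columns are hypothetical, created here only so the sketch is self-contained.

```sql
-- Hypothetical scratch table used only for this illustration.
CREATE TEMPORARY TABLE my_table (a integer, b text);

-- The variation from the example: VALUES is required after the table name.
INSERT INTO my_table VALUES (3, 'hi there');

-- A different variation: the rows come from a query, so no VALUES list appears.
INSERT INTO my_table SELECT 4, 'from a query';

-- Another variation: every column takes its default (NULL for this table).
INSERT INTO my_table DEFAULT VALUES;

-- UPDATE, by contrast, always needs SET right after the table name.
UPDATE my_table SET a = 5 WHERE a = 3;
```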