Syntax Conventions

    A string is a sequence of characters, enclosed within either single quote () or double quote (") characters. Examples:

    Usages of string literals:

    • Values of TEXT type data in INSERT or SELECT statements
    • Full Java class names in UDF and trigger management statements
    • Attribute fields (including attribute keys and attribute values) in UDF / trigger execution or management statements
    • File paths in LOAD / REMOVE / SETTLE statements
    • Password fields in user management statements

    There are several ways to include quote characters within a string:

    • Precede the quote character by an escape character (\).
    • ' inside a string quoted with " needs no special treatment and need not be doubled or escaped. In the same way, " inside a string quoted with ' needs no special treatment.

    The following examples demonstrate how quoting and escaping work:

    1. '"string"' // "string"
    2. '""string""' // ""string""
    3. 'str\'ing' // str'ing
    4. '\'string' // 'string
    5. "string" // string
    6. "'string'" // 'string'
    7. "''string''" // ''string''
    8. "str\"ing" // str"ing
    9. "\"string" // "string

    Numeric Literals

    Number literals include integer (exact-value) literals and floating-point (approximate-value) literals.

    Integers are represented as a sequence of digits. Numbers may be preceded by - or + to indicate a negative or positive value, respectively. Examples: 1, -1.

    Numbers with fractional part or represented in scientific notation with a mantissa and exponent are approximate-value numbers. Examples: .1, , -2.23, +1.70, 1.2E3, 1.2E-3, -1.2E3, -1.2E-3.

    The INT32 and INT64 data types are integer types and calculations are exact.

    The FLOAT and DOUBLE data types are floating-point types and calculations are approximate.

    The timestamp is the time point at which data is produced. It includes absolute timestamps and relative timestamps in IoTDB. For information about timestamp support in IoTDB, see .

    Specially, NOW() represents a constant timestamp that indicates the system time at which the statement began to execute.

    Boolean Literals

    The constants TRUE and FALSE evaluate to 1 and 0, respectively. The constant names can be written in any lettercase.

    The NULL value means “no data.” NULL can be written in any lettercase.

    Identifiers

    Certain objects within IoTDB, including TRIGGER, FUNCTION(UDF), CONTINUOUS QUERY, SCHEMA TEMPLATE, USER, ROLE and other object names are known as identifiers.

    What you need to know about identifiers:

    • Permitted characters in unquoted identifiers:
      • [0-9 a-z A-Z _ : @ # $ { }] (letters, digits, some special characters)
      • [‘\u2E80’..’\u9FFF’] (UNICODE Chinese characters)
    • Identifiers may begin with a digit and consist solely of digits, which is not recommended!
    • Identifiers are case sensitive.
    • Note: User and role names are not case-sensitive, and special characters are not allowed to be escaped.

    If an identifier contains special characters or is a keyword, you must quote it whenever you refer to it. The identifier quote character is the backtick (`):

    We call the part of a path divided by . as a node.

    The constraints of node names are almost the same as the identifiers, but you should note the following points:

    • root is a reserved word, and it is only allowed to appear at the beginning layer of the time series. If root appears in other layers, it cannot be parsed and an error will be reported.
    • Character . is not permitted in unquoted or quoted node names. If you must do it (even if it is not recommended), you can enclose it within either single quote () or double quote ("). In this case, quotes are recognized as part of the node name to avoid ambiguity.
    • Among the node name enclosed in the reverse quota, single quotes and double quotes need to use a backslash to escape.
    • In particular, if the system is deployed on a Windows machine, the storage group layer name will be case-insensitive. For example, creating both root.ln and root.LN at the same time is not allowed.
    1. CREATE TIMESERIES root.a.b.s1+s2/s3.c WITH DATATYPE=INT32, ENCODING=RLE
    2. // invalid!
    3. // root.a.b.`s1+s2/s3`.c will be parsed as Path[root, a, b, s1+s2/s3, c]
    1. CREATE TIMESERIES root.a.b.`s1.s2`.c WITH DATATYPE=INT32, ENCODING=RLE
    2. // invalid!
    3. CREATE TIMESERIES root.a.b."s1.s2".c WITH DATATYPE=INT32, ENCODING=RLE
    4. // root.a.b."s1.s2".c will be parsed as Path[root, a, b, "s1.s2", c]

    Keywords and Reserved Words

    Keywords are words that have significance in SQL require special treatment for use as identifiers and node names, and need to be escaped with backticks. Certain keywords, such as TIME and ROOT, are reserved and cannot use as identifiers and node names (even after escaping).

    shows the keywords and reserved words in IoTDB 0.13.

    IoTDB supports the execution of arbitrary nested expressions consisting of numbers, time series, arithmetic expressions, and time series generating functions (including user-defined functions) in the select clause.

    Note: Node names that consist solely of digits, ' and " in an expression must be enclosed in backticks (`).

    1. -- There exists timeseries: root.sg.d.0, root.sg.d.'a' and root.sg."d".b
    2. select 0 from root.sg.d -- ambiguity exists, parsing failed
    3. select 'a' from root.sg.d -- ambiguity exists, parsing failed
    4. select "d".b from root.sg -- ambiguity exists, parsing failed
    5. select `0` from root.sg.d -- query from root.sg.d.0
    6. select `0` + 0 from root.sg.d -- valid expression, add number 0 to each point of root.sg.d.0
    7. select myudf(`'a'`, 'x') from root.sg.d -- valid expression, call function myudf with timeseries root.sg.d.'a' as the 1st parameter, and a string constant 'x' as the 2nd parameter

    Quote Symbol

    Double quotes (“), single quotes (‘)

    Double quotes and single quotes are used in the following scenarios:

    1. String literals are represented by strings enclosed in single or double quotes.
    2. If you want to use the path separator (.) in the path node name, you need to enclose the node name in single or double quotes. In this case, to avoid ambiguity, the quotes are treated as part of the node name by the system.

    Backticks are used in the following scenarios:

    1. When using special characters in an identifier, the identifier needs to be enclosed in backticks.
    2. When using special characters other than path separators in the path node name, the path node name needs to be enclosed in backticks. In this case, the backticks are not considered part of the node name by the system.

    Backslash (\)

    backslashes are used in the following scenarios:

    • In string literals, double or single quote should be escaped with a backslash.
      • e.g. “str\“ing” is parsed as str”ing, ‘str\‘ing’ is parsed as str’ing.
    • In an identifier, backtick should be escaped with a backslash.
      • e.g. `na\`me` is parsed as na`me.
    • In path node names, double or single quote should be escaped with a backslash. To avoid ambiguity, backslashes are recognized as part of the node name.

    Please read the lexical and grammar description files in our code repository:

    Lexical file: antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IoTDBSqlLexer.g4