Syntax Conventions
A string is a sequence of characters, enclosed within either single quote () or double quote ("
) characters. Examples:
Usages of string literals:
- Values of
TEXT
type data inINSERT
orSELECT
statements - Full Java class names in UDF and trigger management statements
- Attribute fields (including attribute keys and attribute values) in UDF / trigger execution or management statements
- File paths in
LOAD
/REMOVE
/SETTLE
statements - Password fields in user management statements
There are several ways to include quote characters within a string:
- Precede the quote character by an escape character (\).
'
inside a string quoted with"
needs no special treatment and need not be doubled or escaped. In the same way,"
inside a string quoted with'
needs no special treatment.
The following examples demonstrate how quoting and escaping work:
'"string"' // "string"
'""string""' // ""string""
'str\'ing' // str'ing
'\'string' // 'string
"string" // string
"'string'" // 'string'
"''string''" // ''string''
"str\"ing" // str"ing
"\"string" // "string
Numeric Literals
Number literals include integer (exact-value) literals and floating-point (approximate-value) literals.
Integers are represented as a sequence of digits. Numbers may be preceded by -
or +
to indicate a negative or positive value, respectively. Examples: 1
, -1
.
Numbers with fractional part or represented in scientific notation with a mantissa and exponent are approximate-value numbers. Examples: .1
, , -2.23
, +1.70
, 1.2E3
, 1.2E-3
, -1.2E3
, -1.2E-3
.
The INT32
and INT64
data types are integer types and calculations are exact.
The FLOAT
and DOUBLE
data types are floating-point types and calculations are approximate.
The timestamp is the time point at which data is produced. It includes absolute timestamps and relative timestamps in IoTDB. For information about timestamp support in IoTDB, see .
Specially, NOW()
represents a constant timestamp that indicates the system time at which the statement began to execute.
Boolean Literals
The constants TRUE
and FALSE
evaluate to 1 and 0, respectively. The constant names can be written in any lettercase.
The NULL
value means “no data.” NULL
can be written in any lettercase.
Identifiers
Certain objects within IoTDB, including TRIGGER
, FUNCTION
(UDF), CONTINUOUS QUERY
, SCHEMA TEMPLATE
, USER
, ROLE
and other object names are known as identifiers.
What you need to know about identifiers:
- Permitted characters in unquoted identifiers:
- [0-9 a-z A-Z _ : @ # $ { }] (letters, digits, some special characters)
- [‘\u2E80’..’\u9FFF’] (UNICODE Chinese characters)
- Identifiers may begin with a digit and consist solely of digits, which is not recommended!
- Identifiers are case sensitive.
- Note: User and role names are not case-sensitive, and special characters are not allowed to be escaped.
If an identifier contains special characters or is a keyword, you must quote it whenever you refer to it. The identifier quote character is the backtick (`):
We call the part of a path divided by .
as a node
.
The constraints of node names are almost the same as the identifiers, but you should note the following points:
root
is a reserved word, and it is only allowed to appear at the beginning layer of the time series. Ifroot
appears in other layers, it cannot be parsed and an error will be reported.- Character
.
is not permitted in unquoted or quoted node names. If you must do it (even if it is not recommended), you can enclose it within either single quote () or double quote ("
). In this case, quotes are recognized as part of the node name to avoid ambiguity. - Among the node name enclosed in the reverse quota, single quotes and double quotes need to use a backslash to escape.
- In particular, if the system is deployed on a Windows machine, the storage group layer name will be case-insensitive. For example, creating both
root.ln
androot.LN
at the same time is not allowed.
CREATE TIMESERIES root.a.b.s1+s2/s3.c WITH DATATYPE=INT32, ENCODING=RLE
// invalid!
// root.a.b.`s1+s2/s3`.c will be parsed as Path[root, a, b, s1+s2/s3, c]
CREATE TIMESERIES root.a.b.`s1.s2`.c WITH DATATYPE=INT32, ENCODING=RLE
// invalid!
CREATE TIMESERIES root.a.b."s1.s2".c WITH DATATYPE=INT32, ENCODING=RLE
// root.a.b."s1.s2".c will be parsed as Path[root, a, b, "s1.s2", c]
Keywords and Reserved Words
Keywords are words that have significance in SQL require special treatment for use as identifiers and node names, and need to be escaped with backticks. Certain keywords, such as TIME and ROOT, are reserved and cannot use as identifiers and node names (even after escaping).
shows the keywords and reserved words in IoTDB 0.13.
IoTDB supports the execution of arbitrary nested expressions consisting of numbers, time series, arithmetic expressions, and time series generating functions (including user-defined functions) in the select
clause.
Note: Node names that consist solely of digits, '
and "
in an expression must be enclosed in backticks (`).
-- There exists timeseries: root.sg.d.0, root.sg.d.'a' and root.sg."d".b
select 0 from root.sg.d -- ambiguity exists, parsing failed
select 'a' from root.sg.d -- ambiguity exists, parsing failed
select "d".b from root.sg -- ambiguity exists, parsing failed
select `0` from root.sg.d -- query from root.sg.d.0
select `0` + 0 from root.sg.d -- valid expression, add number 0 to each point of root.sg.d.0
select myudf(`'a'`, 'x') from root.sg.d -- valid expression, call function myudf with timeseries root.sg.d.'a' as the 1st parameter, and a string constant 'x' as the 2nd parameter
Quote Symbol
Double quotes (“), single quotes (‘)
Double quotes and single quotes are used in the following scenarios:
- String literals are represented by strings enclosed in single or double quotes.
- If you want to use the path separator (
.
) in the path node name, you need to enclose the node name in single or double quotes. In this case, to avoid ambiguity, the quotes are treated as part of the node name by the system.
Backticks are used in the following scenarios:
- When using special characters in an identifier, the identifier needs to be enclosed in backticks.
- When using special characters other than path separators in the path node name, the path node name needs to be enclosed in backticks. In this case, the backticks are not considered part of the node name by the system.
Backslash (\)
backslashes are used in the following scenarios:
- In string literals, double or single quote should be escaped with a backslash.
- e.g. “str\“ing” is parsed as str”ing, ‘str\‘ing’ is parsed as str’ing.
- In an identifier, backtick should be escaped with a backslash.
- e.g. `na\`me` is parsed as na`me.
- In path node names, double or single quote should be escaped with a backslash. To avoid ambiguity, backslashes are recognized as part of the node name.
Please read the lexical and grammar description files in our code repository:
Lexical file: antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IoTDBSqlLexer.g4