Expressions

    Expressions are used in various places in the native query language, including virtual columns and . They are also generated by most Druid SQL functions during the process.

    This expression language supports the following operators (listed in decreasing order of precedence).

    Long, double, and string data types are supported. If a number contains a dot, it is interpreted as a double, otherwise it is interpreted as a long. That means, always add a ‘.’ to your number if you want it interpreted as a double value. String literals should be quoted by single quotation marks.

    Expressions can contain variables. Variable names may contain letters, digits, ‘_‘ and ‘$’. Variable names must not begin with a digit. To escape other special characters, you can quote it with double quotation marks.

    For logical operators, a number is true if and only if it is positive (0 or negative value means false). For string type, it’s the evaluation result of ‘Boolean.valueOf(string)’.

    Multi-value string dimensions are supported and may be treated as either scalar or array typed values. When treated as a scalar type, an expression will automatically be transformed to apply the scalar operation across all values of the multi-valued type, to mimic Druid’s native behavior. Values that result in arrays will be coerced back into the native Druid string type for aggregation. Druid aggregations on multi-value string dimensions on the individual values, not the ‘array’, behaving similar to the UNNEST operator available in many SQL dialects. However, by using the array_to_string function, aggregations may be done on a stringified version of the complete array, allowing the complete row to be preserved. Using string_to_array in an expression post-aggregator, allows transforming the stringified dimension back into the true native array type.

    namedescription
    castcast(expr,’LONG’ or ‘DOUBLE’ or ‘STRING’ or ‘LONG_ARRAY’, or ‘DOUBLE_ARRAY’ or ‘STRING_ARRAY’) returns expr with specified type. exception can be thrown. Scalar types may be cast to array types and will take the form of a single element list (null will still be null).
    ifif(predicate,then,else) returns ‘then’ if ‘predicate’ evaluates to a positive number, otherwise it returns ‘else’
    nvlnvl(expr,expr-for-null) returns ‘expr-for-null’ if ‘expr’ is null (or empty string for string type)
    likelike(expr, pattern[, escape]) is equivalent to SQL expr LIKE pattern
    case_searchedcase_searched(expr1, result1, [[expr2, result2, …], else-result])
    case_simplecase_simple(expr, value1, result1, [[value2, result2, …], else-result])
    bloom_filter_testbloom_filter_test(expr, filter) tests the value of ‘expr’ against ‘filter’, a bloom filter serialized as a base64 string. See documentation for additional details.

    String functions

    namedescription
    concatconcat(expr, expr…) concatenate a list of strings
    formatformat(pattern[, args…]) returns a string formatted in the manner of Java’s .
    likelike(expr, pattern[, escape]) is equivalent to SQL expr LIKE pattern
    lookuplookup(expr, lookup-name) looks up expr in a registered query-time lookup
    parse_longparse_long(string[, radix]) parses a string as a long with the given radix, or 10 (decimal) if a radix is not provided.
    regexp_extractregexp_extract(expr, pattern[, index]) applies a regular expression pattern and extracts a capture group index, or null if there is no match. If index is unspecified or zero, returns the substring that matched the pattern. The pattern may match anywhere inside expr; if you want to match the entire string instead, use the ^ and $ markers at the start and end of your pattern.
    regexp_likeregexp_like(expr, pattern) returns whether expr matches regular expression pattern. The pattern may match anywhere inside expr; if you want to match the entire string instead, use the ^ and $ markers at the start and end of your pattern.
    contains_stringcontains_string(expr, string) returns whether expr contains string as a substring. This method is case-sensitive.
    icontains_stringcontains_string(expr, string) returns whether expr contains string as a substring. This method is case-insensitive.
    replacereplace(expr, pattern, replacement) replaces pattern with replacement
    substringsubstring(expr, index, length) behaves like java.lang.String’s substring
    rightright(expr, length) returns the rightmost length characters from a string
    leftleft(expr, length) returns the leftmost length characters from a string
    strlenstrlen(expr) returns length of a string in UTF-16 code units
    strposstrpos(haystack, needle[, fromIndex]) returns the position of the needle within the haystack, with indexes starting from 0. The search will begin at fromIndex, or 0 if fromIndex is not specified. If the needle is not found then the function returns -1.
    trimtrim(expr[, chars]) remove leading and trailing characters from expr if they are present in chars. chars defaults to ‘ ‘ (space) if not provided.
    ltrimltrim(expr[, chars]) remove leading characters from expr if they are present in chars. chars defaults to ‘ ‘ (space) if not provided.
    rtrimrtrim(expr[, chars]) remove trailing characters from expr if they are present in chars. chars defaults to ‘ ‘ (space) if not provided.
    lowerlower(expr) converts a string to lowercase
    upperupper(expr) converts a string to uppercase
    reversereverse(expr) reverses a string
    repeatrepeat(expr, N) repeats a string N times
    lpadlpad(expr, length, chars) returns a string of length from expr left-padded with chars. If length is shorter than the length of expr, the result is expr which is truncated to length. The result will be null if either expr or chars is null. If chars is an empty string, no padding is added, however expr may be trimmed if necessary.
    rpadrpad(expr, length, chars) returns a string of length from expr right-padded with . If length is shorter than the length of expr, the result is expr which is truncated to length. The result will be null if either expr or chars is null. If chars is an empty string, no padding is added, however expr may be trimmed if necessary.

    Time functions

    See javadoc of java.lang.Math for detailed explanation for each function.

    namedescription
    absabs(x) would return the absolute value of x
    acosacos(x) would return the arc cosine of x
    asinasin(x) would return the arc sine of x
    atanatan(x) would return the arc tangent of x
    atan2atan2(y, x) would return the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta)
    cbrtcbrt(x) would return the cube root of x
    ceilceil(x) would return the smallest (closest to negative infinity) double value that is greater than or equal to x and is equal to a mathematical integer
    copysigncopysign(x) would return the first floating-point argument with the sign of the second floating-point argument
    coscos(x) would return the trigonometric cosine of x
    coshcosh(x) would return the hyperbolic cosine of x
    cotcot(x) would return the trigonometric cotangent of an angle x
    divdiv(x,y) is integer division of x by y
    expexp(x) would return Euler’s number raised to the power of x
    expm1expm1(x) would return e^x-1
    floorfloor(x) would return the largest (closest to positive infinity) double value that is less than or equal to x and is equal to a mathematical integer
    getExponentgetExponent(x) would return the unbiased exponent used in the representation of x
    hypothypot(x, y) would return sqrt(x^2+y^2) without intermediate overflow or underflow
    loglog(x) would return the natural logarithm of x
    log10log10(x) would return the base 10 logarithm of x
    log1plog1p(x) would the natural logarithm of x + 1
    maxmax(x, y) would return the greater of two values
    minmin(x, y) would return the smaller of two values
    nextafternextafter(x, y) would return the floating-point number adjacent to the x in the direction of the y
    nextUpnextUp(x) would return the floating-point value adjacent to x in the direction of positive infinity
    pipi would return the constant value of the π
    powpow(x, y) would return the value of the x raised to the power of y
    remainderremainder(x, y) would return the remainder operation on two arguments as prescribed by the IEEE 754 standard
    rintrint(x) would return value that is closest in value to x and is equal to a mathematical integer
    roundround(x, y) would return the value of the x rounded to the y decimal places. While x can be an integer or floating-point number, y must be an integer. The type of the return value is specified by that of x. y defaults to 0 if omitted. When y is negative, x is rounded on the left side of the y decimal points. If x is NaN, x will return 0. If x is infinity, x will be converted to the nearest finite double.
    scalbscalb(d, sf) would return d 2^sf rounded as if performed by a single correctly rounded floating-point multiply to a member of the double value set
    signumsignum(x) would return the signum function of the argument x
    sinsin(x) would return the trigonometric sine of an angle x
    sinhsinh(x) would return the hyperbolic sine of x
    sqrtsqrt(x) would return the correctly rounded positive square root of x
    tantan(x) would return the trigonometric tangent of an angle x
    tanhtanh(x) would return the hyperbolic tangent of x
    todegreestodegrees(x) converts an angle measured in radians to an approximately equivalent angle measured in degrees
    toradianstoradians(x) converts an angle measured in degrees to an approximately equivalent angle measured in radians
    ulpulp(x) would return the size of an ulp of the argument x

    Array functions

    functiondescription
    array(expr1,expr …)constructs an array from the expression arguments, using the type of the first argument as the output array type
    array_length(arr)returns length of array expression
    array_offset(arr,long)returns the array element at the 0 based index supplied, or null for an out of range index
    array_ordinal(arr,long)returns the array element at the 1 based index supplied, or null for an out of range index
    array_contains(arr,expr)returns 1 if the array contains the element specified by expr, or contains all elements specified by expr if expr is an array, else 0
    array_overlap(arr1,arr2)returns 1 if arr1 and arr2 have any elements in common, else 0
    array_offset_of(arr,expr)returns the 0 based index of the first occurrence of expr in the array, or -1 or null if druid.generic.useDefaultValueForNull=falseif no matching elements exist in the array.
    array_ordinal_of(arr,expr)returns the 1 based index of the first occurrence of expr in the array, or -1 or null if druid.generic.useDefaultValueForNull=false if no matching elements exist in the array.
    array_prepend(expr,arr)adds expr to arr at the beginning, the resulting array type determined by the type of the array
    array_append(arr1,expr)appends expr to arr, the resulting array type determined by the type of the first array
    array_concat(arr1,arr2)concatenates 2 arrays, the resulting array type determined by the type of the first array
    array_slice(arr,start,end)return the subarray of arr from the 0 based index start(inclusive) to end(exclusive), or null, if start is less than 0, greater than length of arr or less than end
    array_to_string(arr,str)joins all elements of arr by the delimiter specified by str
    string_to_array(str1,str2)splits str1 into an array on the delimiter specified by str2

    Apply functions

    Reduction functions operate on zero or more expressions and return a single expression. If no expressions are passed as arguments, then the result is NULL. The expressions must all be convertible to a common data type, which will be the type of the result:

    • If all arguments are NULL, the result is NULL. Otherwise, NULL arguments are ignored.
    • If the arguments comprise a mix of numbers and strings, the arguments are interpreted as strings.
    • If all arguments are integer numbers, the arguments are interpreted as longs.
    • If all arguments are numbers and at least one argument is a double, the arguments are interpreted as doubles.
    functiondescription
    greatest([expr1, …])Evaluates zero or more expressions and returns the maximum value based on comparisons as described above.
    least([expr1, …])Evaluates zero or more expressions and returns the minimum value based on comparisons as described above.

    IP address functions

    For the IPv4 address functions, the address argument can either be an IPv4 dotted-decimal string (e.g., “192.168.0.1”) or an IP address represented as a long (e.g., 3232235521). The subnet argument should be a string formatted as an IPv4 address subnet in CIDR notation (e.g., “192.168.0.0/16”).

    functiondescription
    ipv4_match(address, subnet)Returns 1 if the address belongs to the subnet literal, else 0. If address is not a valid IPv4 address, then 0 is returned. This function is more efficient if address is a long instead of a string.
    ipv4_parse(address)Parses address into an IPv4 address stored as a long. If address is a long that is a valid IPv4 address, then it is passed through. Returns null if address cannot be represented as an IPv4 address.
    ipv4_stringify(address)Converts address into an IPv4 address dotted-decimal string. If address is a string that is a valid IPv4 address, then it is passed through. Returns null if address cannot be represented as an IPv4 address.

    Vectorization Support

    supported features:

    • cast is supported for numeric and string types
    • math operators: +,-,*,/,%,^ are supported for numeric types
    • comparison operators: =, !=, , >=, <, <= are supported for numeric types
    • math functions: abs, acos, asin, atan, cbrt, ceil, cos, cosh, cot, exp, expm1, floor, getExponent, log, log10, log1p, nextUp, rint, signum, sin, sinh, sqrt, tan, tanh, toDegrees, toRadians, ulp, atan2, copySign, div, hypot, max, min, nextAfter, pow, remainder, scalb are supported for numeric types
    • time functions: timestamp_floor (with constant granularity argument) is supported for numeric types
    • other: is supported for numeric and string types