Regular Expression Functions

    • When using multi-line mode (enabled via the flag), only \n is recognized as a line terminator. Additionally, the (?d) flag is not supported and must not be used.

    • Case-insensitive matching (enabled via the (?i) flag) is always performed in a Unicode-aware manner. However, context-sensitive and local-sensitive matching is not supported. Additionally, the (?u) flag is not supported and must not be used.

    • Surrogate pairs are not supported. For example, \uD800\uDC00 is not treated as U+10000 and must be specified as \x{10000}.

    • Boundaries (\b) are incorrectly handled for a non-spacing mark without a base character.

    • \Q and \E are not supported in character classes (such as [A-Z123]) and are instead treated as literals.

    regexp_extract_all(string, pattern)

    Returns the substring(s) matched by the regular expression pattern in string:

    regexp_extract_all(string, pattern, group)

    Finds all occurrences of the regular expression pattern in string and returns the capturing group number group:

      regexp_extract(string, pattern) → varchar

      Returns the first substring matched by the regular expression pattern in string:

      1. SELECT regexp_extract('1a 2b 14m', '\d+'); -- 1

      regexp_extract(string, pattern, group) → varchar

      Finds the first occurrence of the regular expression pattern in string and returns the group:

      Evaluates the regular expression pattern and determines if it is contained within string.

      This function is similar to the LIKE operator, except that the pattern only needs to be contained within string, rather than needing to match all of . In other words, this performs a contains operation rather than a match operation. You can match the entire string by anchoring the pattern using ^ and $:

      1. SELECT regexp_like('1a 2b 14m', '\d+b'); -- true

      regexp_replace(string, pattern) → varchar

      Removes every instance of the substring matched by the regular expression pattern from string:

      1. SELECT regexp_replace('1a 2b 14m', '\d+[ab] '); -- '14m'

      regexp_replace(string, pattern, replacement) → varchar

      Replaces every instance of the substring matched by the regular expression pattern in string with replacement. Capturing groups can be referenced in replacement using $g for a numbered group or ${name} for a named group. A dollar sign ($) may be included in the replacement by escaping it with a backslash (\$):

      regexp_replace(string, pattern, function) → varchar

      Replaces every instance of the substring matched by the regular expression pattern in string using function. The function is invoked for each match with the capturing groups passed as an array. Capturing group numbers start at one; there is no group for the entire match (if you need this, surround the entire expression with parenthesis).

      1. SELECT regexp_replace('new york', '(\w)(\w*)', x -> upper(x[1]) || lower(x[2])); --'New York'

      regexp_split(string, pattern)

      1. SELECT regexp_split('1a 2b 14m', '\s*[a-z]+\s*'); -- [1, 2, 14, ]