Working with JSON Data

    Greenplum Database supports JSON as specified in the RFC 7159 document and enforces data validity according to the JSON rules. JSON-specific functions and operators are also available for json data; they are described later in this topic.



    When Greenplum Database stores data as the json data type, an exact copy of the input text is stored, and the JSON processing functions reparse the data on each execution.

    • Semantically insignificant white space between tokens is retained, as well as the order of keys within JSON objects.
    • All key/value pairs are kept even if a JSON object contains duplicate keys. For duplicate keys, JSON processing functions consider the last value as the operative one.
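    For example (a minimal sketch using a simple cast to json), the output of the stored value reproduces the input text exactly:

        -- White space, key order, and the duplicate key "a" are all preserved
        -- exactly as entered
        SELECT '{"b": 1,   "a": 2, "a": 3}'::json;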

    Greenplum Database allows only one character set encoding per database. It is not possible for the json type to conform rigidly to the JSON specification unless the database encoding is UTF8. Attempts to include characters that cannot be represented in the database encoding will fail. Characters that can be represented in the database encoding but not in UTF8 are allowed.

    The RFC 7159 document permits JSON strings to contain Unicode escape sequences denoted by \uXXXX. For the json type, the Greenplum Database input function allows Unicode escapes regardless of the database encoding and checks Unicode escapes only for syntactic correctness (a \u followed by four hex digits).
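    For example (a sketch using a simple cast), the first statement is accepted in any database encoding because the escape is checked only for syntax and is not converted on input, while the second is rejected because the escape is malformed:

        -- Accepted: syntactically valid Unicode escape, stored verbatim
        SELECT '"\u00e9"'::json;

        -- Rejected: \u must be followed by four hexadecimal digits
        SELECT '"\u12"'::json;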

    The input and output syntax for the json data type is as specified in RFC 7159.

    The following are all valid json expressions:
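    For instance (an illustrative sketch using the standard ::json cast):

        -- Simple scalar/primitive value (a number, quoted string, true, false, or null)
        SELECT '5'::json;

        -- Array of zero or more elements (the elements need not be of the same type)
        SELECT '[1, 2, "foo", null]'::json;

        -- Object containing key/value pairs; object keys must be quoted strings
        SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;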

    Representing data as JSON can be considerably more flexible than the traditional relational data model, which is compelling in environments where requirements are fluid. It is quite possible for both approaches to co-exist and complement each other within the same application. However, even for applications where maximal flexibility is desired, it is still recommended that JSON documents have a somewhat fixed structure. The structure is typically unenforced (though enforcing some business rules declaratively is possible), but having a predictable structure makes it easier to write queries that usefully summarize a set of “documents” (datums) in a table.

    JSON data is subject to the same concurrency-control considerations as any other data type when stored in a table. Although storing large documents is practicable, keep in mind that any update acquires a row-level lock on the whole row. Consider limiting JSON documents to a manageable size in order to decrease lock contention among updating transactions. Ideally, JSON documents should each represent an atomic datum that business rules dictate cannot reasonably be further subdivided into smaller datums that could be modified independently.

    Greenplum Database provides built-in functions and operators that create and manipulate JSON data.

    Note: For json values, all key/value pairs are kept even if a JSON object contains duplicate keys. For duplicate keys, JSON processing functions consider the last value as the operative one.
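    For example (a sketch assuming the standard -> field-access operator), both key/value pairs are stored, but processing treats the last value as the operative one:

        -- Storage keeps both pairs for the duplicate key "a"
        SELECT '{"a": 1, "a": 2}'::json;

        -- Processing returns the last value: 2
        SELECT '{"a": 1, "a": 2}'::json -> 'a';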

    This table describes the functions that create json values.
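    As an illustration, a minimal sketch assuming PostgreSQL-style creation functions such as to_json, array_to_json, and row_to_json are available:

        -- Convert a SQL value to a JSON scalar
        SELECT to_json('Fred said "Hi."'::text);

        -- Convert a SQL array to a JSON array
        SELECT array_to_json('{{1,5},{99,100}}'::int[]);

        -- Convert a composite (row) value to a JSON object
        SELECT row_to_json(row(1, 'foo'));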

    This table describes the functions that process json values.

    Note: Many of these functions and operators convert Unicode escapes in JSON strings to regular characters. The functions throw an error for characters that cannot be represented in the database encoding.
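    A brief sketch of a few processing functions and operators, assuming json_each, json_extract_path_text, and the -> and ->> operators are supported:

        -- Expand the outermost JSON object into a set of key/value pairs
        SELECT * FROM json_each('{"a": "foo", "b": "bar"}');

        -- Extract a nested value as text by following a path of keys
        SELECT json_extract_path_text('{"f4": {"f5": 99, "f6": "stringy"}}', 'f4', 'f6');

        -- -> returns json; ->> returns text
        SELECT '{"a": {"b": 2}}'::json -> 'a' ->> 'b';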

    For json_populate_record and json_populate_recordset, type coercion from JSON is best effort and might not result in desired values for some types. JSON keys are matched to identical column names in the target row type. JSON fields that do not appear in the target row type are omitted from the output, and target columns that do not match any JSON field return NULL.
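    For example, a sketch using a hypothetical table named subscribers (the table and its columns are assumptions for illustration only):

        -- Hypothetical target row type
        CREATE TABLE subscribers (id int, name text, active boolean);

        -- "id" and "name" match columns and are coerced to int and text;
        -- "extra" has no matching column and is omitted; the unmatched
        -- column "active" is returned as NULL
        SELECT *
        FROM json_populate_record(
            null::subscribers,
            '{"id": 1, "name": "Alice", "extra": "ignored"}'
        );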