Commands

    In the following example, the search command refers to an accounts index as the source, then uses fields and where commands for the conditions:

    In the below examples, we represent required arguments in angle brackets < > and optional arguments in square brackets [ ].

    Use the search command to retrieve a document from an index. You can only use the search command as the first command in the PPL query.

    1. search source=<index> [boolean-expression]

    Example 1: Get all documents

    To get all documents from the accounts index:

    1. search source=accounts;
    account_numberfirstnameaddressbalancegendercityemployerstateageemaillastname
    1Amber880 Holmes Lane39225MBroganPyramiIL32amberduke@pyrami.comDuke
    6Hattie671 Bristol Street5686MDanteNetagyTN36hattiebond@netagy.comBond
    13Nanette789 Madison Street32838FNogalQuilityVA28nullBates
    18Dale467 Hutchinson Court4180MOricknullMD33daleadams@boink.comAdams

    Example 2: Get documents that match a condition

    To get all documents from the accounts index that have either account_number equal to 1 or have gender as F:

    1. search source=accounts account_number=1 or gender=\"F\";
    account_numberfirstnameaddressbalancegendercityemployerstateageemaillastname
    1Amber880 Holmes Lane39225MBroganPyramiIL32amberduke@pyrami.comDuke
    13Nanette789 Madison Street32838FNogalQuilityVA28nullBates

    dedup

    The dedup (data deduplication) command removes duplicate documents defined by a field from the search result.

    Syntax

    1. dedup [int] <field-list> [keepempty=<bool>] [consecutive=<bool>]
    FieldDescriptionTypeRequiredDefault
    intRetain the specified number of duplicate events for each combination. The number must be greater than 0. If you do not specify a number, only the first occurring event is kept and all other duplicates are removed from the results.stringNo1
    keepemptyIf true, keep the document if any field in the field list has a null value or a field missing.nested list of objectsNoFalse
    consecutiveIf true, remove only consecutive events with duplicate combinations of values.BooleanNoFalse
    field-listSpecify a comma-delimited field list. At least one field is required.String or comma-separated list of stringsYes-

    Example 1: Dedup by one field

    To remove duplicate documents with the same gender:

    1. search source=accounts | dedup gender | fields account_number, gender;
    account_numbergender
    1M
    13F

    Example 2: Keep two duplicate documents

    To keep two duplicate documents with the same gender:

    1. search source=accounts | dedup 2 gender | fields account_number, gender;
    account_numbergender
    1M
    6M
    13F

    Example 3: Keep or ignore an empty field by default

    To keep two duplicate documents with a null field value:

    1. search source=accounts | dedup email keepempty=true | fields account_number, email;
    account_numberemail
    1amberduke@pyrami.com
    6hattiebond@netagy.com
    13null
    18daleadams@boink.com

    To remove duplicate documents with the null field value:

    1. search source=accounts | dedup email | fields account_number, email;
    account_numberemail
    1amberduke@pyrami.com
    6hattiebond@netagy.com
    18daleadams@boink.com

    Example 4: Dedup of consecutive documents

    To remove duplicates of consecutive documents:

      account_numbergender
      1M
      13F
      18M

      Limitations

      The dedup command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      eval

      The eval command evaluates an expression and appends its result to the search result.

      Syntax

      1. eval <field>=<expression> ["," <field>=<expression> ]...
      FieldDescriptionRequired
      fieldIf a field name does not exist, a new field is added. If the field name already exists, it’s overwritten.Yes
      expressionSpecify any supported expression.Yes

      Example 1: Create a new field

      To create a new doubleAge field for each document. doubleAge is the result of age multiplied by 2:

      1. search source=accounts | eval doubleAge = age * 2 | fields age, doubleAge;
      agedoubleAge
      3264
      3672
      2856
      3366

      Example 2: Overwrite the existing field

      To overwrite the age field with age plus 1:

      1. search source=accounts | eval age = age + 1 | fields age;
      age
      33
      37
      29
      34

      Example 3: Create a new field with a field defined with the eval command

      To create a new field ddAge. ddAge is the result of doubleAge multiplied by 2, where doubleAge is defined in the eval command:

      1. search source=accounts | eval doubleAge = age * 2, ddAge = doubleAge * 2 | fields age, doubleAge, ddAge;
      agedoubleAgeddAge
      3264128
      3672144
      2856112
      3366132

      Limitation

      The eval command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      fields

      Use the fields command to keep or remove fields from a search result.

      Syntax

      1. fields [+|-] <field-list>
      FieldDescriptionRequiredDefault
      indexPlus (+) keeps only fields specified in the field list. Minus (-) removes all fields specified in the field list.No+
      field listSpecify a comma-delimited list of fields.YesNo default

      Example 1: Select specified fields from result

      To get account_number, firstname, and lastname fields from a search result:

      1. search source=accounts | fields account_number, firstname, lastname;
      account_numberfirstnamelastname
      1AmberDuke
      6HattieBond
      13NanetteBates
      18DaleAdams

      Example 2: Remove specified fields from a search result

      To remove the account_number field from the search results:

      1. search source=accounts | fields account_number, firstname, lastname | fields - account_number;
      firstnamelastname
      AmberDuke
      HattieBond
      NanetteBates
      DaleAdams

      parse

      Use the parse command to parse a text field using regular expression and append the result to the search result.

      Syntax

      1. parse <field> <regular-expression>
      FieldDescriptionRequired
      fieldA text field.Yes
      regular-expressionThe regular expression used to extract new fields from the given test field. If a new field name exists, it will replace the original field.Yes

      The regular expression is used to match the whole text field of each document with Java regex engine. Each named capture group in the expression will become a new STRING field.

      Example 1: Create new field

      The example shows how to create new field host for each document. host will be the hostname after @ in email field. Parsing a null field will return an empty string.

      1. os> source=accounts | parse email '.+@(?<host>.+)' | fields email, host ;
      2. fetched rows / total rows = 4/4

      Example 2: Override the existing field

      1. os> source=accounts | parse address '\d+ (?<address>.+)' | fields address ;
      address
      Holmes Lane
      Bristol Street
      Madison Street
      Hutchinson Court

      Example 3: Filter and sort be casted parsed field

      The example shows how to sort street numbers that are higher than 500 in address field.

      streetNumberstreet
      671Bristol Street
      789Madison Street
      880Holmes Lane

      A few limitations exist when using the parse command:

      • Fields defined by parse cannot be parsed again. For example, source=accounts | parse address '\d+ (?<street>.+)' | parse street '\w+ (?<road>\w+)' ; will fail to return any expressions.
      • Fields defined by parse cannot be overridden with other commands. For example, when entering source=accounts | parse address '\d+ (?<street>.+)' | eval street='1' | where street='1' ; where will not match any documents since street cannot be overridden.
      • The text field used by parse cannot be overridden. For example, when entering source=accounts | parse address '\d+ (?<street>.+)' | eval address='1' ; street will not be parse since address is overridden.
      • Fields defined by parse cannot be filtered/sorted after using them in the stats command. For example, source=accounts | parse email '.+@(?<host>.+)' | stats avg(age) by host | where host=pyrami.com ; where will not parse the domain listed.

      Use the rename command to rename one or more fields in the search result.

      Syntax

      1. rename <source-field> AS <target-field>["," <source-field> AS <target-field>]...
      FieldDescriptionRequired
      source-fieldThe name of the field that you want to rename.Yes
      target-fieldThe name you want to rename to.Yes

      Example 1: Rename one field

      Rename the account_number field as an:

      1. search source=accounts | rename account_number as an | fields an;
      an
      1
      6
      13
      18

      Example 2: Rename multiple fields

      Rename the account_number field as an and employer as emp:

      1. search source=accounts | rename account_number as an, employer as emp | fields an, emp;
      anemp
      1Pyrami
      6Netagy
      13Quility
      18null

      Limitations

      The rename command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      sort

      Use the sort command to sort search results by a specified field.

      Syntax

      1. sort [count] <[+|-] sort-field>...
      FieldDescriptionRequiredDefault
      countThe maximum number results to return from the sorted result. If count=0, all results are returned.No1000
      [+|-]Use plus [+] to sort by ascending order and minus [-] to sort by descending order.NoAscending order
      sort-fieldSpecify the field that you want to sort by.Yes-

      Example 1: Sort by one field

      To sort all documents by the age field in ascending order:

      1. search source=accounts | sort age | fields account_number, age;
      account_numberage
      1328
      132
      1833
      636

      Example 2: Sort by one field and return all results

      To sort all documents by the age field in ascending order and specify count as 0 to get back all results:

      1. search source=accounts | sort 0 age | fields account_number, age;
      account_numberage
      1328
      132
      1833
      636

      Example 3: Sort by one field in descending order

      To sort all documents by the age field in descending order:

      1. search source=accounts | sort - age | fields account_number, age;
      account_numberage
      636
      1833
      132
      1328

      Example 4: Specify the number of sorted documents to return

      To sort all documents by the age field in ascending order and specify count as 2 to get back two results:

      1. search source=accounts | sort 2 age | fields account_number, age;
      account_numberage
      1328
      132

      Example 5: Sort by multiple fields

      To sort all documents by the gender field in ascending order and age field in descending order:

      1. search source=accounts | sort + gender, - age | fields account_number, gender, age;
      account_numbergenderage
      13F28
      6M36
      18M33
      1M32

      stats

      Use the stats command to aggregate from search results.

      The following table lists the aggregation functions and also indicates how each one handles null or missing values:

      FunctionNULLMISSING
      COUNTNot countedNot counted
      SUMIgnoreIgnore
      AVGIgnoreIgnore
      MAXIgnoreIgnore
      MINIgnoreIgnore

      Syntax

      1. stats <aggregation>... [by-clause]...
      FieldDescriptionRequiredDefault
      aggregationSpecify a statistical aggregation function. The argument of this function must be a field.Yes1000
      by-clauseSpecify one or more fields to group the results by. If not specified, the stats command returns only one row, which is the aggregation over the entire result set.No-

      Example 1: Calculate the average value of a field

      To calculate the average age of all documents:

      1. search source=accounts | stats avg(age);
      avg(age)
      32.25

      Example 2: Calculate the average value of a field by group

      To calculate the average age grouped by gender:

      1. search source=accounts | stats avg(age) by gender;
      genderavg(age)
      F28.0
      M33.666666666666664

      Example 3: Calculate the average and sum of a field by group

      To calculate the average and sum of age grouped by gender:

      1. search source=accounts | stats avg(age), sum(age) by gender;
      genderavg(age)sum(age)
      F2828
      M33.666666666666664101

      Example 4: Calculate the maximum value of a field

      To calculate the maximum age:

      1. search source=accounts | stats max(age);

      Example 5: Calculate the maximum and minimum value of a field by group

      To calculate the maximum and minimum age values grouped by gender:

      1. search source=accounts | stats max(age), min(age) by gender;
      gendermin(age)max(age)
      F2828
      M3236

      where

      Use the where command with a bool expression to filter the search result. The where command only returns the result when the bool expression evaluates to true.

      Syntax

      1. where <boolean-expression>
      FieldDescriptionRequired
      bool-expressionAn expression that evaluates to a boolean value.No

      Example 1: Filter result set with a condition

      To get all documents from the accounts index where account_number is 1 or gender is F:

      1. search source=accounts | where account_number=1 or gender=\"F\" | fields account_number, gender;
      account_numbergender
      1M
      13F

      head

      Use the head command to return the first N number of results in a specified search order.

      Syntax

      1. head [N]
      FieldDescriptionRequiredDefault
      NSpecify the number of results to return.No10

      To get the first 10 results:

      1. search source=accounts | fields firstname, age | head;
      firstnameage
      Amber32
      Hattie36
      Nanette28

      Example 2: Get the first N results

      To get the first two results:

      firstnameage
      Amber32
      Hattie36

      The head command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      Use the rare command to find the least common values of all fields in a field list. A maximum of 10 results are returned for each distinct set of values of the group-by fields.

      Syntax

      1. rare <field-list> [by-clause]
      FieldDescriptionRequired
      field-listSpecify a comma-delimited list of field names.No
      by-clauseSpecify one or more fields to group the results by.No

      Example 1: Find the least common values in a field

      To find the least common values of gender:

      1. search source=accounts | rare gender;
      gender
      F
      M

      Example 2: Find the least common values grouped by gender

      To find the least common age grouped by gender:

      1. search source=accounts | rare age by gender;
      genderage
      F28
      M32
      M33

      Limitations

      The rare command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      top

      Use the top command to find the most common values of all fields in the field list.

      Syntax

      1. top [N] <field-list> [by-clause]
      FieldDescriptionDefault
      Specify the number of results to return.10
      field-listSpecify a comma-delimited list of field names.-
      by-clauseSpecify one or more fields to group the results by.-

      Example 1: Find the most common values in a field

      To find the most common genders:

      1. search source=accounts | top gender;
      gender
      M
      F

      Example 2: Find the most common value in a field

      To find the most common gender:

      1. search source=accounts | top 1 gender;
      gender
      M

      Example 2: Find the most common values grouped by gender

      To find the most common age grouped by gender:

      1. search source=accounts | top 1 age by gender;
      genderage
      F28
      M32

      Limitations

      The top command is not rewritten to OpenSearch DSL, it is only executed on the coordination node.

      match

      Use the match command to search documents that match a string, number, date, or boolean value for a given field.

      Syntax

      1. match(field_expression, query_expression[, option=<option_value>]*)

      You can specify the following options:

      • analyzer
      • auto_generate_synonyms_phrase
      • fuzziness
      • prefix_length
      • fuzzy_transpositions
      • fuzzy_rewrite
      • lenient
      • operator
      • minimum_should_match
      • zero_terms_query
      • boost

      Example 1: Search the message field:

      1. GET my_index/_search
      2. {
      3. "query": {
      4. "match": {
      5. "message": "this is a test"
      6. }
      7. }
      8. }

      PPL query:

      1. search source=my_index | match field=message query="this is a test"

      Example 2: Search the message field with the operator parameter:

      1. GET my_index/_search
      2. {
      3. "query": {
      4. "match": {
      5. "message": {
      6. "query": "this is a test",
      7. "operator": "and"
      8. }
      9. }
      10. }
      11. }

      PPL query:

      1. search source=my_index | match field=message query="this is a test" operator=and

      Example 3: Search the message field with the operator and zero_terms_query parameters:

      1. GET my_index/_search
      2. {
      3. "query": {
      4. "match": {
      5. "message": {
      6. "query": "to be or not to be",
      7. "operator": "and",
      8. "zero_terms_query": "all"
      9. }
      10. }
      11. }
      12. }

      PPL query:

      1. search source=my_index | where match(message, "this is a test", operator=and, zero_terms_query=all)

      ad

      The ad command applies the Random Cut Forest (RCF) algorithm in the ML Commons plugin on the search result returned by a PPL command. Based on the input, the plugin uses two types of RCF algorithms: fixed in time RCF for processing time-series data and batch RCF for processing non-time-series data.

      Fixed In Time RCF For Time-series Data Command Syntax

      1. ad <shingle_size> <time_decay> <time_field>
      FieldDescriptionRequired
      shingle_sizeA consecutive sequence of the most recent records. The default value is 8.No
      time_decaySpecifies how much of the recent past to consider when computing an anomaly score. The default value is 0.001.No
      time_fieldSpecifies the time filed for RCF to use as time-series data. Must be either a long value, such as the timestamp in miliseconds, or a string value in “yyyy-MM-dd HH:mm:ss”.Yes
      1. ad <shingle_size> <time_decay>
      FieldDescriptionRequired
      shingle_sizeA consecutive sequence of the most recent records. The default value is 8.No
      time_decaySpecifies how much of the recent past to consider when computing an anomaly score. The default value is 0.001.No

      Example 1: Detecting events in New York City from taxi ridership data with time-series data

      The example trains a RCF model and use the model to detect anomalies in the time-series ridership data.

      PPL query:

      1. os> source=nyc_taxi | fields value, timestamp | AD time_field='timestamp' | where value=10844.0
      valuetimestampscoreanomaly_grade
      10844.014041728000000.00.0

      Example 2: Detecting events in New York City from taxi ridership data with non-time-series data

      PPL query:

      1. os> source=nyc_taxi | fields value | AD | where value=10844.0

      kmeans

      The kmeans command applies the ML Commons plugin’s kmeans algorithm to the provided PPL command’s search results.

        For cluster-number, enter the number of clusters you want to group your data points into.

        Example

        The example shows how to classify three Iris species (Iris setosa, Iris virginica and Iris versicolor) based on the combination of four features measured from each sample: the length and the width of the sepals and petals.

        sepal_length_in_cmsepal_width_in_cmpetal_length_in_cmpetal_width_in_cmClusterID 
         5.13.51.40.21
         5.63.04.11.30
         6.72.55.81.82