MongoDB Input

    For additional information about MongoDB, see the MongoDB documentation.

    Transform name: Specify the unique name of the MongoDB Input transform in the pipeline.

    Preview button: Display the rows generated by this transform. Enter the maximum number of records that you want to preview, then click OK. The preview data appears in the Examine preview data window.

    Configure Connection tab

    Input options tab

    The Input options tab enables you to specify which database and collection you want to retrieve information from. You can also indicate the read preferences and tag sets in this tab.

    Enter the following information in the Input options fields:

    Option: Definition

    Database: Name of the database to retrieve data from. Click Get DBs to populate the drop-down menu with a list of databases on the server.

    Collection: Name of the collection to retrieve data from. Click Get collections to populate the drop-down menu with a list of collections within the database.

    Read preference: Specify which node to read from first: Primary, Primary preferred, Secondary, Secondary preferred, or Nearest.

    Tag set specification table

    Tags allow you to customize write concerns and read preferences for a replica set. The Tag set specification table allows you to specify criteria for selecting replica set members. See Tag Sets for more information.
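
    To make the selection criteria concrete, the following Python sketch emulates how an ordered list of tag sets can be matched against replica set members: the tag sets are tried in order, and the first one that matches at least one member decides the candidates. The host names, tag names, and the select_members helper are hypothetical. The function only illustrates the matching rule and is not driver or Apache Hop code.

```python
from typing import Any, Dict, List

Member = Dict[str, Any]


def select_members(members: List[Member], tag_sets: List[Dict[str, str]]) -> List[Member]:
    """Return the members matched by the first tag set that matches anything.

    A member matches a tag set when it carries every tag in that set,
    so an empty tag set ({}) matches every member.
    With no tag sets at all, every member is eligible.
    """
    if not tag_sets:
        return list(members)
    for tag_set in tag_sets:
        matched = [
            m for m in members
            if all(m["tags"].get(key) == value for key, value in tag_set.items())
        ]
        if matched:
            return matched
    return []


# Hypothetical replica set members and their tags.
members = [
    {"host": "node1:27017", "tags": {"dc": "east", "use": "production"}},
    {"host": "node2:27017", "tags": {"dc": "west", "use": "reporting"}},
    {"host": "node3:27017", "tags": {"dc": "west", "use": "production"}},
]

# Prefer reporting nodes in "east"; otherwise fall back to any "west" node.
tag_sets = [{"dc": "east", "use": "reporting"}, {"dc": "west"}]
selected = [m["host"] for m in select_members(members, tag_sets)]
print(selected)
```

    In this sketch no member satisfies the first tag set, so the second one decides the result. Listing a more specific tag set first and a broader one last is a common way to express a preference with a fallback.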

    Enter the following information in the Tag Set fields:

    Query tab

    The Query tab enables you to refine read requests. This tab operates in two different query modes:

    • Query expression mode.

    • Aggregation pipeline specification mode.

    The Query is aggregation pipeline option toggles between these two modes. The Query expression field uses MongoDB's JSON-like query language with query operators to perform query operations. The Aggregation pipeline specification field uses MongoDB's aggregation framework to transform and combine documents in a collection. An aggregation pipeline connects several pipeline expressions together, with the output of each expression becoming the input for the next.
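
    As a rough illustration of the two modes, the Python sketch below holds one query expression and one aggregation pipeline as plain data structures and prints them as JSON text. It then chains two hand-written stage functions to show how each stage consumes the output of the stage before it. The documents, field names, and the match and sort_desc helpers are hypothetical stand-ins, not MongoDB's implementation, and the exact text the field accepts (for example, with or without the enclosing brackets) is not specified in this section.

```python
import json
from functools import reduce

# Query expression mode: a single filter document.
query_expression = {"status": "active"}

# Aggregation pipeline mode: an ordered list of stages.
pipeline = [
    {"$match": {"status": "active"}},
    {"$sort": {"age": -1}},
]

print(json.dumps(query_expression))
print(json.dumps(pipeline))


# Hypothetical in-memory stand-ins for the two stages above.
def match(field, value):
    return lambda docs: [d for d in docs if d.get(field) == value]


def sort_desc(field):
    return lambda docs: sorted(docs, key=lambda d: d[field], reverse=True)


docs = [
    {"name": "a", "status": "active", "age": 30},
    {"name": "b", "status": "inactive", "age": 50},
    {"name": "c", "status": "active", "age": 41},
]

# Each stage receives the output of the stage before it.
stages = [match("status", "active"), sort_desc("age")]
result = reduce(lambda current, stage: stage(current), stages, docs)
print([d["name"] for d in result])
```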

    Enter the following information in the Query fields:

    Field/Option: Definition

    Query expression (JSON): Enter a query expression in this field to limit the output.

    Aggregation pipeline specification (JSON): Select the Query is aggregation pipeline option to display the Aggregation pipeline specification (JSON) field. Then enter a pipeline expression to perform aggregations or selections. The method name, including the collection name of the database you selected in the Input options tab, appears after the label for this field.

    Query is aggregation pipeline: Select this option to use the aggregation pipeline framework.

    Execute for each row: Select this option to perform the query on each row of data.

    Fields expression (JSON): Enter an argument to control the projection (the fields to return) from a query. If empty, all fields are returned. This field is only available for query expressions.
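
    The effect of a fields expression can be sketched in Python as follows. The project helper below emulates only a simple inclusion projection on top-level fields, and the document and field names are hypothetical. It is meant to show which fields come back, not how MongoDB evaluates projections internally.

```python
import json


def project(doc: dict, fields: dict) -> dict:
    """Emulate a simple inclusion projection on top-level fields.

    Fields marked 1 are kept; _id is kept unless it is explicitly set to 0.
    An empty projection returns the whole document.
    """
    if not fields:
        return dict(doc)
    keep = {key for key, flag in fields.items() if flag == 1}
    if fields.get("_id", 1) != 0:
        keep.add("_id")
    return {key: value for key, value in doc.items() if key in keep}


# Hypothetical document and fields expression.
doc = {"_id": 7, "name": "MongoDB", "type": "database", "count": 1}
fields_expression = {"name": 1, "count": 1, "_id": 0}

print(json.dumps(fields_expression))  # JSON text of the projection
print(project(doc, fields_expression))
```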

    Fields tab

    Use the Fields tab to define properties for exported fields. The Fields tab operates in two different modes:

    1. Including all fields in a single JSON field.

    2. Including selected fields in the output.

    If you store the output in a single JSON field, you can parse this JSON using the JSON Input transform, or by using a User Defined Java Class transform.
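
    Outside a pipeline, the same idea can be sketched in Python: parse the JSON string once, then pull out the values you need. The sample string, the dotted-path notation, and the extract helper are assumptions made for this illustration and do not reflect the syntax of the JSON Input transform.

```python
import json

# Hypothetical value of the single JSON output field for one row.
json_field = '{"name": "MongoDB", "tags": ["db", "nosql"], "stats": {"count": 1}}'


def extract(doc: dict, path: str):
    """Follow a dotted path such as 'stats.count'; return None if a key is missing."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


document = json.loads(json_field)
row = {path: extract(document, path) for path in ("name", "stats.count", "missing")}
print(row)
```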

    Note: All fields in the Fields tab except the Name of JSON output field are inactive when the Output single JSON field is selected. When the Output single JSON field is not selected, the Name of JSON output field is inactive.

    General options:

    • Get fields button: Click this button to generate a sample set of documents. You can edit the field names, paths, and data types in the sample.

    • Output single JSON field: Specify that the query result is output as a single JSON field with the String data type (default).

    Enter the following information in the table if you want to output distinct fields:

    The following sections contain examples of query expressions and aggregate pipelines.

    Query expression

    MongoDB allows you to select and filter documents in a collection using specific fields and values. The MongoDB Extended JSON documentation details how to use queries. Apache Hop supports only the features discussed on this page.
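
    Because a query expression is JSON text, one way to avoid quoting mistakes is to build it from data structures and serialize it. The Python sketch below does this for three expressions, including a date written in the Extended JSON $date form. The extended_json_date helper and the sample values are hypothetical, and the output uses strict JSON with quoted keys.

```python
import json
from datetime import datetime, timezone


def extended_json_date(moment: datetime) -> dict:
    """Wrap a timezone-aware datetime as an Extended JSON {"$date": "..."} value in UTC."""
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return {"$date": utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"}


cutoff = datetime(2014, 12, 31, tzinfo=timezone.utc)

queries = [
    {"name": {"$regex": "m.*", "$options": "i"}},
    {"name": {"$nin": ["MongoDB", "MySQL"]}},
    {"created_at": {"$gte": extended_json_date(cutoff)}},
]

# Print each expression as JSON text.
for query in queries:
    print(json.dumps(query))
```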

    The following table displays some examples of the syntax and structure of the queries you can use to request data from MongoDB:

    Query expression
        Description

    { name : "MongoDB" }
        Queries all values where the name field has a value equal to MongoDB.

    { name : { '$regex' : "m.*", '$options' : "i" } }
        Uses a regular expression to find name fields that contain m, case insensitive. The pattern is not anchored, so use "^m" to match only names that start with m.

    { name : { '$gt' : "M" } }
        Finds all names that are greater than M in string order.

    { name : { '$lte' : "T" } }
        Finds all names that are less than or equal to T in string order.

    { name : { '$in' : [ "MongoDB", "MySQL" ] } }
        Finds all names that are either MongoDB or MySQL.

    { name : { '$nin' : [ "MongoDB", "MySQL" ] } }
        Finds all names that are neither MongoDB nor MySQL, or where the field is not set.

    { created_at : { $gte : { $date : "2014-12-31T00:00:00.000Z" } } }
        Finds all documents where created_at is greater than or equal to the specified UTC date.

    { $where : "this.count == 1" }
        Uses JavaScript to evaluate a condition.

    Returns all documents in the collection named collection sorted by the age field in descending order.
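
    The semantics of several of these operators can be approximated with a small in-memory matcher. The Python sketch below is an illustration under simplifying assumptions: it handles only top-level string fields, compares strings by code point, and skips $where and $date. The matches and find helpers and the sample documents are hypothetical.

```python
import re
from typing import Any, Dict, List

_MISSING = object()  # marks a field that is not set


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Approximate a few query operators on top-level string fields."""
    for field, condition in query.items():
        value = doc.get(field, _MISSING)
        if not isinstance(condition, dict):
            # Plain value: equality match.
            if value != condition:
                return False
            continue
        if "$regex" in condition:
            flags = re.IGNORECASE if "i" in condition.get("$options", "") else 0
            # re.search is unanchored, like a $regex pattern without ^.
            if value is _MISSING or not re.search(condition["$regex"], value, flags):
                return False
        if "$gt" in condition and (value is _MISSING or not value > condition["$gt"]):
            return False
        if "$lte" in condition and (value is _MISSING or not value <= condition["$lte"]):
            return False
        if "$in" in condition and value not in condition["$in"]:
            return False
        # A missing field is never in the list, so it passes $nin.
        if "$nin" in condition and value in condition["$nin"]:
            return False
    return True


docs = [
    {"name": "MongoDB"},
    {"name": "MySQL"},
    {"name": "Tomcat"},
    {"name": "apache"},
    {"type": "unnamed"},  # no name field
]


def find(query: Dict[str, Any]) -> List[Any]:
    return [d.get("name") for d in docs if matches(d, query)]


print(find({"name": {"$regex": "m.*", "$options": "i"}}))
print(find({"name": {"$lte": "T"}}))
print(find({"name": {"$nin": ["MongoDB", "MySQL"]}}))
```

    Note that the unanchored pattern also selects Tomcat, and that the document without a name field is returned by the $nin query.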

    Aggregate pipeline

    MongoDB allows you to select and filter documents using the aggregation pipeline framework. The Aggregation page in the MongoDB documentation provides additional examples of function calls.

    The following table displays some examples of the query syntax and structure you can use to request data from MongoDB: