Collations

    See also the API docs for collation.

    Collations are a new feature in MongoDB version 3.4. They provide a set of rules to use when comparing strings that comply with the conventions of a particular language, such as Spanish or German. If no collation is specified, the server sorts strings based on a binary comparison. Many languages have specific ordering rules, and collations allow users to build applications that adhere to language-specific comparison rules.

    In French, for example, the last accent in a given word determines the sorting order. The correct sorting order for the following four words in French is:

        cote < côte < coté < côté

    Specifying a French collation allows users to sort string fields using the French sort order.
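
    To make the difference from a binary comparison concrete, the sketch below sorts four French words (cote, côte, coté, côté, chosen here for illustration) first by code point and then with a simplified key that compares accents starting from the end of the word. The french_key helper is hypothetical and only approximates the rule described above; it is not how the server or ICU implements collation.

```python
import unicodedata

# Illustrative words, normalized so each accented letter is one code point.
words = [unicodedata.normalize('NFC', w)
         for w in ['côté', 'cote', 'coté', 'côte']]

# Code point order, which behaves like a binary comparison for these words.
binary_order = sorted(words)


def french_key(word):
    """Simplified stand-in for French accent ordering (not ICU)."""
    base = []
    accents = []
    for ch in word:
        decomposed = unicodedata.normalize('NFD', ch)
        base.append(decomposed[0])      # the unaccented letter
        accents.append(decomposed[1:])  # combining marks, '' if none
    # Compare base letters first, then accents from the last letter backwards.
    return (base, accents[::-1])


french_order = sorted(words, key=french_key)

print(binary_order)
print(french_order)
```

    In this sketch binary_order places coté before côte, while french_order yields cote, côte, coté, côté.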

    Collations can be specified with the Collation model or with plain Python dictionaries. The structure is the same:

        Collation(locale=<string>,
                  caseLevel=<bool>,
                  caseFirst=<string>,
                  strength=<int>,
                  numericOrdering=<bool>,
                  alternate=<string>,
                  maxVariable=<string>,
                  backwards=<bool>)

    The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.

    For a complete description of the available parameters, see the MongoDB manual.
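
    Because the dictionary form uses the same field names, a collation document can be assembled with ordinary Python. The sketch below uses a hypothetical make_collation helper (not part of PyMongo) that enforces the one required field, locale, and drops any option left as None.

```python
def make_collation(locale, **options):
    """Build a collation dictionary; only `locale` is required.

    Hypothetical helper for illustration, not a PyMongo function.
    """
    if not locale:
        raise ValueError('locale is required')
    document = {'locale': locale}
    # Keep only the options that were actually set.
    document.update({k: v for k, v in options.items() if v is not None})
    return document


us_english = make_collation('en_US')
canadian_french = make_collation('fr_CA', backwards=True, caseLevel=None)
```

    Here us_english is {'locale': 'en_US'} and canadian_french is {'locale': 'fr_CA', 'backwards': True}; either dictionary can be passed wherever a collation is accepted.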

    A collection can be created with a default collation. For example, creating a new collection called contacts with a default collation of the fr_CA locale ensures that all queries run against the contacts collection use the fr_CA collation unless another collation is explicitly specified.
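
    A minimal sketch of creating such a collection, assuming db is a PyMongo database handle such as MongoClient().test. The function name create_contacts is hypothetical, and the collation is written as a plain dictionary; Collation(locale='fr_CA') from pymongo.collation would be the model form.

```python
def create_contacts(db):
    """Create the 'contacts' collection with fr_CA as its default collation.

    `db` is assumed to be a pymongo Database, e.g. MongoClient().test.
    """
    # The collation is passed as a keyword argument to create_collection.
    return db.create_collection('contacts', collation={'locale': 'fr_CA'})
```

    With a running server this could be called as create_contacts(MongoClient().test).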

    The following example shows how to create an index on the name field of the contacts collection, with the unique parameter enabled and a default collation with locale set to fr_CA:

        from pymongo import MongoClient
        from pymongo.collation import Collation

        contacts = MongoClient().test.contacts
        contacts.create_index('name',
                              unique=True,
                              collation=Collation(locale='fr_CA'))

    Individual queries can specify a collation to use when sorting results. For example, a query on the contacts collection in database test can match documents that contain New York in the city field and sort on the name field with the fr_CA collation.
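
    A minimal sketch of such a query, assuming contacts is a PyMongo collection handle such as MongoClient().test.contacts and that the sort field is name. The function name find_new_york_sorted is hypothetical, and the collation is given as a plain dictionary rather than a Collation instance.

```python
def find_new_york_sorted(contacts):
    """Return a cursor over New York contacts, sorted by name under fr_CA.

    `contacts` is assumed to be a pymongo Collection.
    """
    cursor = contacts.find({'city': 'New York'})
    # The collation is attached to the cursor, so it applies to the sort.
    return cursor.sort('name').collation({'locale': 'fr_CA'})
```

    Iterating over the returned cursor runs the query; nothing is sent to the server until then.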

    You can use collations to control document matching rules for several different types of queries. All the various update and delete methods (update_one(), update_many(), delete_one(), etc.) support collation, and you can create query filters that employ collations to comply with any of the languages and variants available to the locale parameter.
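
    As a sketch of collation-aware matching in a delete, the function below passes a collation to delete_one(). It assumes contacts is a PyMongo collection handle; the function name and the choice of the de locale with strength 1 (primary, which compares base letters only) are illustrative assumptions.

```python
def delete_contact_ignoring_accents(contacts, first_name):
    """Delete one contact whose first_name matches on base letters only.

    `contacts` is assumed to be a pymongo Collection.
    """
    result = contacts.delete_one(
        {'first_name': first_name},
        # Primary strength ignores differences in case and accents.
        collation={'locale': 'de', 'strength': 1},
    )
    return result.deleted_count
```

    The return value is the number of documents removed, which is 0 or 1 for delete_one().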

    The following example uses a collation with strength set to SECONDARY, which considers only the base character and character accents in string comparisons and ignores other differences such as case. All documents in the contacts collection with jürgen (case-insensitive) in the first_name field are updated:

        from pymongo import MongoClient
        from pymongo.collation import Collation, CollationStrength

        contacts = MongoClient().test.contacts
        result = contacts.update_many(
            {'first_name': 'jürgen'},
            {'$set': {'verified': 1}},
            collation=Collation(locale='de',