Collations
The API docs for collation.
Collations are a new feature in MongoDB version 3.4. They provide a set of rules to use when comparing strings that comply with the conventions of a particular language, such as Spanish or German. If no collation is specified, the server sorts strings based on a binary comparison. Many languages have specific ordering rules, and collations allow users to build applications that adhere to language-specific comparison rules.
In French, for example, the last accent in a given word determines the sorting order. The correct sorting order for the following four words in French is:
Specifying a French collation allows users to sort string fields using the French sort order.
Collations can be specified with the model or with plain Python dictionaries. The structure is the same:
caseLevel=<bool>,
caseFirst=<string>,
numericOrdering=<bool>,
alternate=<string>,
maxVariable=<string>,
backwards=<bool>)
The only required parameter is locale
, which the server parses as an ICU format locale ID. For example, set locale
to en_US
to represent US English or fr_CA
to represent Canadian French.
For a complete description of the available parameters, see the MongoDB .
The following example demonstrates how to create a new collection called contacts
and assign a default collation with the fr_CA
locale. This operation ensures that all queries that are run against the contacts
collection use the fr_CA
collation unless another collation is explicitly specified:
The following example shows how to create an index on the field of the contacts
collection, with the unique
parameter enabled and a default collation with locale
set to fr_CA
:
from pymongo import MongoClient
from pymongo.collation import Collation
contacts = MongoClient().test.contacts
contacts.create_index('name',
collation=Collation(locale='fr_CA'))
Individual queries can specify a collation to use when sorting results. The following example demonstrates a query that runs on the contacts
collection in database test
. It matches on documents that contain New York
in the city
field, and sorts on the field with the fr_CA
collation:
You can use collations to control document matching rules for several different types of queries. All the various update and delete methods (update_one(), , delete_one(), etc.) support collation, and you can create query filters which employ collations to comply with any of the languages and variants available to the locale
parameter.
The following example uses a collation with strength
set to , which considers only the base character and character accents in string comparisons, but not case sensitivity, for example. All documents in the contacts
collection with jürgen
(case-insensitive) in the first_name
field are updated:
from pymongo import MongoClient
from pymongo.collation import Collation, CollationStrength
contacts = MongoClient().test.contacts
result = contacts.update_many(
{'first_name': 'jürgen'},
{'$set': {'verified': 1}},
collation=Collation(locale='de',