More advanced features of the high-level .NET client (OpenSearch.Client)

copy

OpenSearch uses dynamic mapping to infer field types of the documents that are indexed. However, to have more control over the schema of your document, you can pass an explicit mapping to OpenSearch. You can define data types for some or all fields of your document in this mapping.

Similarly, OpenSearch.Client uses auto mapping to infer field data types based on the types of the class’s properties. To use auto mapping, create a index using the AutoMap’s default constructor:

var createResponse = await osClient.Indices.CreateAsync("students",
    c => c.Map(m => m.AutoMap<Student>()));

copy

If you use auto mapping, Id and GradYear are mapped as integers, Gpa is mapped as a double, and FirstName and LastName are mapped as text with a keyword subfield. If you want to search for FirstName and LastName and allow only case-sensitive full matches, you can suppress analyzing by mapping these fields as keyword only. In Query DSL, you can accomplish this using the following query:

PUT students
{
  "mappings" : {
    "properties" : {
      "firstName" : {
        "type" : "keyword"
      },
      "lastName" : {
        "type" : "keyword"
      }
    }
  }
}

In OpenSearch.Client, you can use fluid lambda syntax to mark these fields as keywords:

copy

Settings

In addition to mappings, you can specify settings like the number of primary and replica shards when creating an index. The following query sets the number of primary shards to 1 and the number of replica shards to 2:

PUT students
{
  "mappings" : {
    "properties" : {
      "firstName" : {
        "type" : "keyword"
      },
      "lastName" : {
        "type" : "keyword"
      }
    }
  }, 
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  }
}

var createResponse = await osClient.Indices.CreateAsync(index,
                            c => c.Map(m => m.AutoMap<Student>()
                            .Properties<Student>(p => p
                            .Keyword(k => k.Name(f => f.FirstName))
                            .Keyword(k => k.Name(f => f.LastName))))
                            .Settings(s => s.NumberOfShards(1).NumberOfReplicas(2)));

copy

In addition to indexing one document using Index and IndexDocument and indexing multiple documents using IndexMany, you can gain more control over document indexing by using Bulk or BulkAll. Indexing documents individually is inefficient because it creates an HTTP request for every document sent. The BulkAll helper frees you from handling retry, chunking or back off request functionality. It automatically retries if the request fails, backs off if the server is down, and controls how many documents are sent in one HTTP request.

In the following example, BulkAll is configured with the index name, number of back off retries, and back off time. Additionally, the maximum degrees of parallelism setting controls the number of parallel HTTP requests containing the data. Finally, the size parameter signals how many documents are sent in one HTTP request.

We recommend setting the size to 100–1000 documents in production.

BulkAll takes a stream of data and returns an Observable that you can use to observe the background operation.

copy

Searching with Boolean query

OpenSearch.Client exposes full OpenSearch query capability. In addition to simple searches that use the match query, you can create a more complex Boolean query to search for students who graduated in 2022 and sort them by last name. In the example below, search is limited to 10 documents, and the scroll API is used to control the pagination of results.

var gradResponse = await osClient.SearchAsync<Student>(s => s
                        .Index(index)
                        .From(0)
                        .Size(10)
                        .Scroll("1m")
                        .Query(q => q
                        .Filter(f => f
                        .Term(t => t.Field(fld => fld.GradYear).Value(2022)))))
                        .Sort(srt => srt.Ascending(f => f.LastName)));

copy

OpenSearch.Client includes the full OpenSearch query functionality, including aggregations. In addition to grouping search results into buckets (for example, grouping students by GPA ranges), you can calculate metrics like sum or average. The following query calculates the average GPA of all students in the index.

Setting Size to 0 means OpenSearch will only return the aggregation, not the actual documents.

var aggResponse = await osClient.SearchAsync<Student>(s => s
                                .Index(index)
                                .Size(0)
                                .Aggregations(a => a
                                .Average("average gpa", 
                                            avg => avg.Field(fld => fld.Gpa))));

copy

Sample program for creating an index and indexing data

The following program creates an index, reads a stream of student records from a comma-separated file and indexes this data into OpenSearch.

copy

The following program searches students by name and graduation date and calculates the average GPA.

using OpenSearch.Client;
namespace NetClientProgram;
internal class Program
{
    private const string index = "students";
    public static IOpenSearchClient osClient = new OpenSearchClient();
    public static async Task Main(string[] args)
    {
        await SearchByName();
        await SearchByGradDate();
        await CalculateAverageGpa();
    }
    private static async Task SearchByName()
    {
        Console.WriteLine("Searching for name......");
        var nameResponse = await osClient.SearchAsync<Student>(s => s
                                .Index(index)
                                .Query(q => q
                                .Match(m => m
                                .Field(fld => fld.FirstName)
                                .Query("Zhang"))));
        if (!nameResponse.IsValid)
        {
            throw new Exception("Aggregation query response is not valid.");
        }
        foreach (var s in nameResponse.Documents)
        {
            Console.WriteLine($"{s.Id} {s.LastName} " +
                $"{s.FirstName} {s.Gpa} {s.GradYear}");
        }
    }
    {
        Console.WriteLine("Searching for grad date......");
        // Search for all students who graduated in 2022
        var gradResponse = await osClient.SearchAsync<Student>(s => s
                                .Index(index)
                                .From(0)
                                .Size(2)
                                .Scroll("1m")
                                .Query(q => q
                                .Bool(b => b
                                .Filter(f => f
                                .Term(t => t.Field(fld => fld.GradYear).Value(2022)))))
                                .Sort(srt => srt.Ascending(f => f.LastName))
                                .Size(10));
        if (!gradResponse.IsValid)
        {
            throw new Exception("Grad date query response is not valid.");
        }
        while (gradResponse.Documents.Any())
        {
            foreach (var data in gradResponse.Documents)
            {
                Console.WriteLine($"{data.Id} {data.LastName} {data.FirstName} " +
                    $"{data.Gpa} {data.GradYear}");
            }
            gradResponse = osClient.Scroll<Student>("1m", gradResponse.ScrollId);
        }
    }
    public static async Task CalculateAverageGpa()
    {
        Console.WriteLine("Calculating average GPA......");
        // Search and aggregate
        // Size 0 means documents are not returned, only aggregation is returned
        var aggResponse = await osClient.SearchAsync<Student>(s => s
                                .Index(index)
                                .Size(0)
                                .Aggregations(a => a
                                .Average("average gpa", 
                                            avg => avg.Field(fld => fld.Gpa))));
        if (!aggResponse.IsValid) throw new Exception("Aggregation response not valid");
        var avg = aggResponse.Aggregations.Average("average gpa").Value;
        Console.WriteLine($"Average GPA is {avg}");
    }

copy