Configuring the log store

    You can make modifications to your log store, including:

    • storage for your Elasticsearch cluster

    • shard replication across data nodes in the cluster, from full replication to no replication

    • external access to Elasticsearch data

    Elasticsearch is a memory-intensive application. Each Elasticsearch node needs at least 16G of memory for both memory requests and limits, unless you specify otherwise in the custom resource. The initial set of OKD nodes might not be large enough to support the Elasticsearch cluster. You must add additional nodes to the OKD cluster to run with the recommended or higher memory, up to a maximum of 64G for each Elasticsearch node.

    Each Elasticsearch node can operate with a lower memory setting, though this is not recommended for production environments.

    By default, OpenShift Logging does not store audit logs in the internal OKD Elasticsearch log store. You can send audit logs to this log store so, for example, you can view them in Kibana.

    To send the audit logs to the default internal Elasticsearch log store, for example to view the audit logs in Kibana, you must use the Log Forwarding API.

    Procedure

    To use the Log Forwarding API to forward audit logs to the internal Elasticsearch instance:

    1. Create or edit a YAML file that defines the ClusterLogForwarder CR object:

      • Create a CR to send all log types to the internal Elasticsearch instance. You can use the following example without making any changes:
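A minimal sketch of such a CR follows; the pipeline name all-to-default is illustrative, but the default output and the three input types are the ones this procedure relies on:

```yaml
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  pipelines:
  - name: all-to-default
    inputRefs:
    - infrastructure
    - application
    - audit
    outputRefs:
    - default
```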

        (1) A pipeline defines the type of logs to forward using the specified output. The default output forwards logs to the internal Elasticsearch instance.

        You must specify all three types of logs in the pipeline: application, infrastructure, and audit. If you do not specify a log type, those logs are not stored and will be lost.

      • If you have an existing ClusterLogForwarder CR, add a pipeline to the default output for the audit logs. You do not need to define the default output. For example:

        apiVersion: "logging.openshift.io/v1"
        kind: ClusterLogForwarder
        metadata:
          name: instance
          namespace: openshift-logging
        spec:
          outputs:
          - name: elasticsearch-insecure
            type: "elasticsearch"
            url: http://elasticsearch-insecure.messaging.svc.cluster.local
            insecure: true
          - name: elasticsearch-secure
            type: "elasticsearch"
            url: https://elasticsearch-secure.messaging.svc.cluster.local
            secret:
              name: es-audit
          - name: secureforward-offcluster
            type: "fluentdForward"
            url: https://secureforward.offcluster.com:24224
            secret:
              name: secureforward
          pipelines:
          - name: container-logs
            inputRefs:
            - application
            outputRefs:
            - secureforward-offcluster
          - name: infra-logs
            inputRefs:
            - infrastructure
            outputRefs:
            - elasticsearch-insecure
          - name: audit-logs
            inputRefs:
            - audit
            outputRefs:
            - elasticsearch-secure
            - default (1)
        (1) This pipeline sends the audit logs to the internal Elasticsearch instance in addition to an external instance.

    Additional resources

    For more information on the Log Forwarding API, see Forwarding logs using the Log Forwarding API.

    Configuring log retention time

    You can configure a retention policy that specifies how long the default Elasticsearch log store keeps indices for each of the three log sources: infrastructure logs, application logs, and audit logs.

    To configure the retention policy, you set a maxAge parameter for each log source in the ClusterLogging custom resource (CR). The CR applies these values to the Elasticsearch rollover schedule, which determines when Elasticsearch deletes the rolled-over indices.

    Elasticsearch rolls over an index, moving the current index and creating a new index, when an index matches any of the following conditions:

    • The index is older than the rollover.maxAge value in the Elasticsearch CR.

    • The index size is greater than 40 GB × the number of primary shards.

    • The index doc count is greater than 40960 KB × the number of primary shards.
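For intuition, the first two conditions can be sketched as a simple predicate. This is illustrative only: the OpenShift Elasticsearch Operator evaluates these conditions internally, and the doc-count condition is omitted here.

```python
def should_rollover(index_age_hours, index_size_gb, primary_shards,
                    max_age_hours=8, max_size_gb_per_shard=40):
    """Return True if an index meets either rollover condition described above.

    max_age_hours corresponds to rollover.maxAge (8h for infrastructure logs
    in the example below); the size threshold is 40 GB per primary shard.
    """
    too_old = index_age_hours > max_age_hours
    too_big = index_size_gb > max_size_gb_per_shard * primary_shards
    return too_old or too_big

# A 3-shard index under 120 GB and younger than 8 hours is not rolled over:
print(should_rollover(index_age_hours=6, index_size_gb=100, primary_shards=3))  # False
print(should_rollover(index_age_hours=9, index_size_gb=100, primary_shards=3))  # True
```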

    Elasticsearch deletes the rolled-over indices based on the retention policy you configure. If you do not create a retention policy for any log sources, logs are deleted after seven days by default.

    Prerequisites

    • OpenShift Logging and the OpenShift Elasticsearch Operator must be installed.

    Procedure

    To configure the log retention time:

    1. Edit the ClusterLogging CR to add or modify the retentionPolicy parameter:

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      ...
      spec:
        managementState: "Managed"
        logStore:
          type: "elasticsearch"
          retentionPolicy: (1)
            application:
              maxAge: 1d
            infra:
              maxAge: 7d
            audit:
              maxAge: 7d
          elasticsearch:
            nodeCount: 3
      ...
      (1) Specify the time that Elasticsearch should retain each log source. Enter an integer and a time designation: weeks(w), days(d), hours(h/H), minutes(m), and seconds(s). For example, 1d for one day. Logs older than the maxAge are deleted. By default, logs are retained for seven days.
    2. You can verify the settings in the Elasticsearch custom resource (CR).

      For example, the Red Hat OpenShift Logging Operator updates the following Elasticsearch CR to configure a retention policy that rolls over active indices for the infrastructure logs every eight hours and deletes the rolled-over indices seven days after rollover. OKD checks every 15 minutes to determine whether the indices need to be rolled over.

      apiVersion: "logging.openshift.io/v1"
      kind: "Elasticsearch"
      metadata:
        name: "elasticsearch"
      spec:
      ...
        indexManagement:
          policies: (1)
          - name: infra-policy
            phases:
              delete:
                minAge: 7d (2)
              hot:
                actions:
                  rollover:
                    maxAge: 8h (3)
            pollInterval: 15m (4)
      ...
      (1) For each log source, the retention policy indicates when to delete and roll over logs for that source.
      (2) When OKD deletes the rolled-over indices. This setting is the maxAge you set in the ClusterLogging CR.
      (3) The index age for OKD to consider when rolling over the indices. This value is determined from the maxAge you set in the ClusterLogging CR.
      (4) When OKD checks if the indices should be rolled over. This setting is the default and cannot be changed.

      The OpenShift Elasticsearch Operator deploys a cron job to roll over indices for each mapping using the defined policy, scheduled using the pollInterval.

      $ oc get cronjob

      Example output

      NAME                     SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
      elasticsearch-im-audit   */15 * * * *   False     0        <none>          4s
      elasticsearch-im-infra   */15 * * * *   False     0        <none>          4s

    Configuring CPU and memory requests for the log store

    Each component specification allows for adjustments to both the CPU and memory requests. You should not have to manually adjust these values as the OpenShift Elasticsearch Operator sets values sufficient for your environment.

    In large-scale clusters, the default memory limit for the Elasticsearch proxy container might not be sufficient, causing the proxy container to be OOMKilled. If you experience this issue, increase the memory requests and limits for the Elasticsearch proxy.

    Each Elasticsearch node can operate with a lower memory setting, though this is not recommended for production deployments. For production use, you should have no less than the default 16Gi allocated to each pod. Preferably you should allocate as much as possible, up to 64Gi per pod.

    Prerequisites

    • OpenShift Logging and Elasticsearch must be installed.

    Procedure

    1. Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

      $ oc edit ClusterLogging instance

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      metadata:
        name: "instance"
      ....
      spec:
        logStore:
          type: "elasticsearch"
          elasticsearch:
            resources: (1)
              limits:
                memory: 16Gi
              requests:
                cpu: "1"
                memory: 16Gi
            proxy: (2)
              resources:
                limits:
                  memory: 100Mi
                requests:
                  memory: 100Mi
      (1) Specify the CPU and memory requests for Elasticsearch as needed. If you leave these values blank, the OpenShift Elasticsearch Operator sets default values that should be sufficient for most deployments. The default values are 16Gi for the memory request and 1 for the CPU request.
      (2) Specify the CPU and memory requests for the Elasticsearch proxy as needed. If you leave these values blank, the OpenShift Elasticsearch Operator sets default values that should be sufficient for most deployments. The default values are 256Mi for the memory request and 100m for the CPU request.

    If you adjust the amount of Elasticsearch memory, you must change both the request value and the limit value.

    For example:

    resources:
      limits:
        memory: "32Gi"
      requests:
        cpu: "8"
        memory: "32Gi"

    Configuring replication policy for the log store

    You can define how Elasticsearch shards are replicated across data nodes in the cluster.

    Prerequisites

    • OpenShift Logging and Elasticsearch must be installed.

    Procedure

    1. Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

      $ oc edit clusterlogging instance

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      metadata:
        name: "instance"
      ....
      spec:
        logStore:
          type: "elasticsearch"
          elasticsearch:
            redundancyPolicy: "SingleRedundancy" (1)
      (1) Specify a redundancy policy for the shards. The change is applied upon saving the changes.
      • FullRedundancy. Elasticsearch fully replicates the primary shards for each index to every data node. This provides the highest safety, but at the cost of the highest amount of disk required and the poorest performance.

      • MultipleRedundancy. Elasticsearch fully replicates the primary shards for each index to half of the data nodes. This provides a good tradeoff between safety and performance.

      • SingleRedundancy. Elasticsearch makes one copy of the primary shards for each index. Logs are always available and recoverable as long as at least two data nodes exist. Better performance than MultipleRedundancy, when using 5 or more nodes. You cannot apply this policy on deployments with a single Elasticsearch node.

      • ZeroRedundancy. Elasticsearch does not make copies of the primary shards. Logs might be unavailable or lost in the event a node is down or fails. Use this mode when you are more concerned with performance than safety, or have implemented your own disk/PVC backup/restore strategy.

    The number of primary shards for the index templates is equal to the number of Elasticsearch data nodes.
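To see how the shards for each index are actually laid out after changing the policy, you can query the _cat/shards API through the same es_util tool used in the rolling-restart procedure below; the pod placeholder follows the same convention used there:

```shell
# List every shard, its index, whether it is a primary (p) or replica (r),
# and the node it is assigned to.
$ oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- es_util --query="_cat/shards?v"
```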

    Scaling down Elasticsearch pods

    Reducing the number of Elasticsearch pods in your cluster can result in data loss or Elasticsearch performance degradation.

    If you scale down, you should scale down by one pod at a time and allow the cluster to re-balance the shards and replicas. After the Elasticsearch health status returns to green, you can scale down by another pod.

    If your Elasticsearch cluster is set to ZeroRedundancy, you should not scale down your Elasticsearch pods.
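The wait-between-pods step can be sketched as a small shell loop, assuming the es_util tool from the rolling-restart procedure below; the pod name placeholder follows the same convention used there:

```shell
# Wait until the Elasticsearch cluster reports green before scaling down
# the next pod.
until oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- \
    es_util --query=_cluster/health?pretty=true | grep -q '"status" : "green"'; do
  sleep 30
done
```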

    Configuring persistent storage for the log store

    Elasticsearch requires persistent storage. The faster the storage, the faster the Elasticsearch performance.

    Prerequisites

    • OpenShift Logging and Elasticsearch must be installed.

    Procedure

    1. Edit the ClusterLogging CR to specify that each data node in the cluster is bound to a Persistent Volume Claim.

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      metadata:
        name: "instance"
      # ...
      spec:
        logStore:
          type: "elasticsearch"
          elasticsearch:
            nodeCount: 3
            storage:
              storageClassName: "gp2"
              size: "200G"

    This example specifies each data node in the cluster is bound to a Persistent Volume Claim that requests “200G” of AWS General Purpose SSD (gp2) storage.

    If you use a local volume for persistent storage, do not use a raw block volume, which is described with volumeMode: block in the LocalVolume object. Elasticsearch cannot use raw block volumes.

    Configuring the log store for emptyDir storage

    You can use emptyDir with your log store, which creates an ephemeral deployment in which all of a pod’s data is lost upon restart.

    When using emptyDir, if log storage is restarted or redeployed, you will lose data.

    Prerequisites

    • OpenShift Logging and Elasticsearch must be installed.

    Procedure

    1. Edit the ClusterLogging CR to specify emptyDir:

      spec:
        logStore:
          type: "elasticsearch"
          elasticsearch:
            nodeCount: 3
            storage: {}

    Performing an Elasticsearch rolling cluster restart

    Perform a rolling restart when you change the elasticsearch config map or any of the elasticsearch-* deployment configurations.

    Also, a rolling restart is recommended if the node on which an Elasticsearch pod runs requires a reboot.

    Prerequisites

    • OpenShift Logging and Elasticsearch must be installed.

    • Install the OKD es_util tool

    Procedure

    To perform a rolling cluster restart:

    1. Change to the openshift-logging project:

      $ oc project openshift-logging
    2. Get the names of the Elasticsearch pods:

      $ oc get pods | grep elasticsearch-
    3. Scale down the Fluentd pods so they stop sending new logs to Elasticsearch:

      $ oc -n openshift-logging patch daemonset/logging-fluentd -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-infra-fluentd": "false"}}}}}'
    4. Perform a shard synced flush using the OKD tool to ensure there are no pending operations waiting to be written to disk prior to shutting down:

      $ oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- es_util --query="_flush/synced" -XPOST

      For example:

      $ oc exec elasticsearch-cdm-5ceex6ts-1-dcd6c4c7c-jpw6 -c elasticsearch -- es_util --query="_flush/synced" -XPOST


    5. Prevent shard balancing when purposely bringing down nodes using the OKD es_util tool:

      $ oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- es_util --query="_cluster/settings" -XPUT -d '{ "persistent": { "cluster.routing.allocation.enable" : "primaries" } }'

      For example:

      $ oc exec elasticsearch-cdm-5ceex6ts-1-dcd6c4c7c-jpw6 -c elasticsearch -- es_util --query="_cluster/settings" -XPUT -d '{ "persistent": { "cluster.routing.allocation.enable" : "primaries" } }'

      Example output

      1. {"acknowledged":true,"persistent":{"cluster":{"routing":{"allocation":{"enable":"primaries"}}}},"transient":
    6. After the command is complete, do the following for each deployment in the Elasticsearch cluster:

        $ oc rollout resume deployment/<deployment-name>

        For example:

        $ oc rollout resume deployment/elasticsearch-cdm-0-1

        Example output

        deployment.extensions/elasticsearch-cdm-0-1 resumed

        A new pod is deployed. After the pod has a ready container, you can move on to the next deployment.

        $ oc get pods | grep elasticsearch-

        Example output

        NAME                                            READY   STATUS    RESTARTS   AGE
        elasticsearch-cdm-5ceex6ts-2-f799564cb-l9mj7    2/2     Running   0          22h
        elasticsearch-cdm-5ceex6ts-3-585968dc68-k7kjr   2/2     Running   0          22h
      1. After the deployments are complete, reset the pod to disallow rollouts:

        $ oc rollout pause deployment/<deployment-name>

        For example:

        $ oc rollout pause deployment/elasticsearch-cdm-0-1

        Check that the Elasticsearch cluster is in a green or yellow state:

          $ oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- es_util --query=_cluster/health?pretty=true

          If you performed a rollout on the Elasticsearch pod you used in the previous commands, the pod no longer exists and you need a new pod name here.

          For example:

          $ oc exec elasticsearch-cdm-5ceex6ts-1-dcd6c4c7c-jpw6 -c elasticsearch -- es_util --query=_cluster/health?pretty=true
          {
            "cluster_name" : "elasticsearch",
            "status" : "yellow", (1)
            "timed_out" : false,
            "number_of_nodes" : 3,
            "number_of_data_nodes" : 3,
            "active_primary_shards" : 8,
            "active_shards" : 16,
            "relocating_shards" : 0,
            "initializing_shards" : 0,
            "unassigned_shards" : 1,
            "delayed_unassigned_shards" : 0,
            "number_of_pending_tasks" : 0,
            "number_of_in_flight_fetch" : 0,
            "task_max_waiting_in_queue_millis" : 0,
            "active_shards_percent_as_number" : 100.0
          }
          (1) Make sure this parameter value is green or yellow before proceeding.
      2. If you changed the Elasticsearch configuration map, repeat these steps for each Elasticsearch pod.

      3. After all the deployments for the cluster have been rolled out, re-enable shard balancing:

        $ oc exec <any_es_pod_in_the_cluster> -c elasticsearch -- es_util --query="_cluster/settings" -XPUT -d '{ "persistent": { "cluster.routing.allocation.enable" : "all" } }'

        For example:

        $ oc exec elasticsearch-cdm-5ceex6ts-1-dcd6c4c7c-jpw6 -c elasticsearch -- es_util --query="_cluster/settings" -XPUT -d '{ "persistent": { "cluster.routing.allocation.enable" : "all" } }'

        Example output

        {
          "acknowledged" : true,
          "persistent" : { },
          "transient" : {
            "cluster" : {
              "routing" : {
                "allocation" : {
                  "enable" : "all"
                }
              }
            }
          }
        }
      4. Scale up the Fluentd pods so they send new logs to Elasticsearch.
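Assuming Fluentd was scaled down with the nodeSelector patch from step 3 of this procedure, reversing that patch scales it back up:

```shell
$ oc -n openshift-logging patch daemonset/logging-fluentd -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-infra-fluentd": "true"}}}}}'
```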

      Exposing the log store service as a route

      By default, the log store that is deployed with OpenShift Logging is not accessible from outside the logging cluster. You can enable a route with re-encryption termination for external access to the log store service for those tools that access its data.

      Externally, you can access the log store by creating a reencrypt route, using your OKD token and the installed log store CA certificate. Then, access a node that hosts the log store service with a cURL request that contains:

      • The Authorization: Bearer ${token} header

      • The Elasticsearch reencrypt route and an Elasticsearch API request.

      Internally, you can access the log store service using the log store cluster IP, which you can get by using either of the following commands:

      $ oc get service elasticsearch -o jsonpath={.spec.clusterIP} -n openshift-logging

      Example output

      172.30.183.229

      $ oc get service elasticsearch -n openshift-logging

      Example output

      NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
      elasticsearch   ClusterIP   172.30.183.229   <none>        9200/TCP   22h

      You can check the cluster IP address with a command similar to the following:

      $ oc exec elasticsearch-cdm-oplnhinv-1-5746475887-fj2f8 -n openshift-logging -- curl -tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://172.30.183.229:9200/_cat/health"

      Example output

        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100    29  100    29    0     0    108      0 --:--:-- --:--:-- --:--:--   108

      Prerequisites

      • OpenShift Logging and Elasticsearch must be installed.

      • You must have access to the project to be able to access the logs.

      Procedure

      To expose the log store externally:

      1. Change to the openshift-logging project:

        $ oc project openshift-logging
      2. Extract the CA certificate from the log store and write to the admin-ca file:

        $ oc extract secret/elasticsearch --to=. --keys=admin-ca

        Example output

        admin-ca
      3. Create the route for the log store service as a YAML file:

        1. Create a YAML file with the following:

          apiVersion: route.openshift.io/v1
          kind: Route
          metadata:
            name: elasticsearch
            namespace: openshift-logging
          spec:
            host:
            to:
              kind: Service
              name: elasticsearch
            tls:
              termination: reencrypt
              destinationCACertificate: | (1)
          (1) Add the log store CA certificate or use the command in the next step. You do not have to set the spec.tls.key, spec.tls.certificate, and spec.tls.caCertificate parameters required by some reencrypt routes.
        2. Run the following command to add the log store CA certificate to the route YAML you created in the previous step:

          $ cat ./admin-ca | sed -e "s/^/      /" >> <file-name>.yaml
        3. Create the route:

          $ oc create -f <file-name>.yaml

          Example output

          route.route.openshift.io/elasticsearch created
      4. Check that the Elasticsearch service is exposed:

        1. Get the token of this service account to be used in the request:

          $ token=$(oc whoami -t)
        2. Set the elasticsearch route you created as an environment variable.

          $ routeES=`oc get route elasticsearch -o jsonpath={.spec.host}`
        3. To verify the route was successfully created, run the following command that accesses Elasticsearch through the exposed route:

          $ curl -tlsv1.2 --insecure -H "Authorization: Bearer ${token}" "https://${routeES}"

          Example output

          {
            "name" : "elasticsearch-cdm-i40ktba0-1",
            "cluster_name" : "elasticsearch",
            "cluster_uuid" : "0eY-tJzcR3KOdpgeMJo-MQ",
            "version" : {
              "number" : "6.8.1",
              "build_flavor" : "oss",
              "build_type" : "zip",
              "build_hash" : "Unknown",
              "build_date" : "Unknown",
              "build_snapshot" : true,
              "lucene_version" : "7.7.0",
              "minimum_wire_compatibility_version" : "5.6.0",
              "minimum_index_compatibility_version" : "5.0.0"
            },