Understanding the File Integrity Operator

    An instance of the FileIntegrity custom resource (CR) represents a set of continuous file integrity scans for one or more nodes.

    Each FileIntegrity CR is backed by a daemon set running AIDE on the nodes matching the FileIntegrity CR specification.

    Procedure

    1. Create the following example FileIntegrity CR named worker-fileintegrity.yaml to enable scans on worker nodes:

      Example FileIntegrity CR
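      A minimal sketch of such a CR, assuming the default worker role label; the spec values shown (node selector, gracePeriod) are illustrative and should be adapted to your cluster:

      ```yaml
      apiVersion: fileintegrity.openshift.io/v1alpha1
      kind: FileIntegrity
      metadata:
        name: worker-fileintegrity
        namespace: openshift-file-integrity
      spec:
        # Illustrative: target nodes carrying the standard worker role label.
        nodeSelector:
          node-role.kubernetes.io/worker: ""
        config:
          # Illustrative: seconds to wait between AIDE integrity checks.
          gracePeriod: 900
        debug: false
      ```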

    2. Apply the YAML file to the openshift-file-integrity namespace:

      $ oc apply -f worker-fileintegrity.yaml -n openshift-file-integrity

    Verification

    • Confirm the FileIntegrity object was created successfully by running the following command:

      $ oc get fileintegrities -n openshift-file-integrity

      Example output

      NAME                   AGE
      worker-fileintegrity   14s

    Checking the FileIntegrity custom resource status

    The FileIntegrity custom resource (CR) reports its status through the .status.phase subresource.

    • To query the FileIntegrity CR status, run:

      $ oc get fileintegrities/worker-fileintegrity -o jsonpath="{ .status.phase }"

      Example output

      Active
    The .status.phase value is one of the following:

    • Pending - The phase after the custom resource (CR) is created.

    • Active - The phase when the backing daemon set is up and running.

    • Initializing - The phase when the AIDE database is being reinitialized.

    Understanding the FileIntegrityNodeStatuses object

    The scan results of the FileIntegrity CR are reported in another object called FileIntegrityNodeStatuses.

    $ oc get fileintegritynodestatuses

    Example output

    NAME                                                AGE
    worker-fileintegrity-ip-10-0-130-192.ec2.internal   101s
    worker-fileintegrity-ip-10-0-147-133.ec2.internal   109s
    worker-fileintegrity-ip-10-0-165-160.ec2.internal   102s

    There is one result object per node. The nodeName attribute of each FileIntegrityNodeStatus object corresponds to the node being scanned. The status of the file integrity scan is represented in the results array, which holds scan conditions.

    The FileIntegrityNodeStatus object reports the latest status of an AIDE run and exposes the status as Failed, Succeeded, or Errored in a status field.

    $ oc get fileintegritynodestatuses -w

    Example output

    NAME                                                               NODE                                         STATUS
    example-fileintegrity-ip-10-0-134-186.us-east-2.compute.internal   ip-10-0-134-186.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-150-230.us-east-2.compute.internal   ip-10-0-150-230.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-169-137.us-east-2.compute.internal   ip-10-0-169-137.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-180-200.us-east-2.compute.internal   ip-10-0-180-200.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-194-66.us-east-2.compute.internal    ip-10-0-194-66.us-east-2.compute.internal    Failed
    example-fileintegrity-ip-10-0-134-186.us-east-2.compute.internal   ip-10-0-134-186.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-222-188.us-east-2.compute.internal   ip-10-0-222-188.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-194-66.us-east-2.compute.internal    ip-10-0-194-66.us-east-2.compute.internal    Failed
    example-fileintegrity-ip-10-0-150-230.us-east-2.compute.internal   ip-10-0-150-230.us-east-2.compute.internal   Succeeded
    example-fileintegrity-ip-10-0-180-200.us-east-2.compute.internal   ip-10-0-180-200.us-east-2.compute.internal   Succeeded

    These conditions are reported in the results array of the corresponding FileIntegrityNodeStatus CR status:

    • Succeeded - The integrity check passed; the files and directories covered by the AIDE check have not been modified since the database was last initialized.

    • Failed - The integrity check failed; some files or directories covered by the AIDE check have been modified since the database was last initialized.

    Example output of a condition with a success status

    [
      {
        "condition": "Succeeded",
        "lastProbeTime": "2020-09-15T12:45:57Z"
      }
    ]
    [
      {
        "condition": "Succeeded",
        "lastProbeTime": "2020-09-15T12:46:03Z"
      }
    ]
    [
      {
        "condition": "Succeeded",
        "lastProbeTime": "2020-09-15T12:45:48Z"
      }
    ]

    In this case, all three scans succeeded and so far there are no other conditions.
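    Conditions like these can also be filtered on the command line. The sketch below runs jq (used later in this procedure) against a hypothetical saved copy of a results array rather than a live cluster:

    ```shell
    # Hypothetical saved results array mirroring the structure shown above.
    cat <<'EOF' > /tmp/results.json
    [
      {"condition": "Succeeded", "lastProbeTime": "2020-09-15T12:45:57Z"},
      {"condition": "Failed",    "lastProbeTime": "2020-09-15T12:57:20Z"}
    ]
    EOF

    # Print the probe time of every Failed condition.
    jq -r '.[] | select(.condition == "Failed") | .lastProbeTime' /tmp/results.json
    ```

    Against a live cluster, the same filter can be applied to the results array extracted with oc get ... -o jsonpath.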

    FileIntegrityNodeStatus CR failure status example

    To simulate a failure condition, modify one of the files AIDE tracks. For example, modify /etc/resolv.conf on one of the worker nodes:

    $ oc debug node/ip-10-0-130-192.ec2.internal

    Example output

    Creating debug namespace/openshift-debug-node-ldfbj ...
    Starting pod/ip-10-0-130-192ec2internal-debug ...
    To use host binaries, run `chroot /host`
    Pod IP: 10.0.130.192
    If you don't see a command prompt, try pressing enter.
    sh-4.2# echo "# integrity test" >> /host/etc/resolv.conf
    sh-4.2# exit
    Removing debug pod ...
    Removing debug namespace/openshift-debug-node-ldfbj ...

    After some time, the Failed condition is reported in the results array of the corresponding FileIntegrityNodeStatus object. The previous Succeeded condition is retained, which allows you to pinpoint the time the check failed.

    $ oc get fileintegritynodestatuses.fileintegrity.openshift.io/worker-fileintegrity-ip-10-0-130-192.ec2.internal -o jsonpath='{.results}' | jq -r

    Alternatively, to view the results for all FileIntegrityNodeStatus objects without specifying a name, run:

    $ oc get fileintegritynodestatuses.fileintegrity.openshift.io -o jsonpath='{.items[*].results}' | jq

    Example output (illustrative; the exact timestamps and config map name depend on your cluster)

    [
      {
        "condition": "Succeeded",
        "lastProbeTime": "2020-09-15T12:54:14Z"
      },
      {
        "condition": "Failed",
        "filesChanged": 1,
        "lastProbeTime": "2020-09-15T12:57:20Z",
        "resultConfigMapName": "aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed",
        "resultConfigMapNamespace": "openshift-file-integrity"
      }
    ]

    The Failed condition points to a config map that gives more details about what exactly failed and why:

    $ oc describe cm aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed

    Example output

    Name:         aide-ds-worker-fileintegrity-ip-10-0-130-192.ec2.internal-failed
    Namespace:    openshift-file-integrity
    Labels:       file-integrity.openshift.io/node=ip-10-0-130-192.ec2.internal
                  file-integrity.openshift.io/owner=worker-fileintegrity
                  file-integrity.openshift.io/result-log=
    Annotations:  file-integrity.openshift.io/files-added: 0
                  file-integrity.openshift.io/files-removed: 0

    integritylog:
    ------
    AIDE 0.15.1 found differences between database and filesystem!!
    Start timestamp: 2020-09-15 12:58:15

    Summary:
      Total number of files:  31553
      Added files:            0
      Removed files:          0
      Changed files:          1

    ---------------------------------------------------
    Changed files:
    ---------------------------------------------------

    changed: /hostroot/etc/resolv.conf

    ---------------------------------------------------
    Detailed information about changes:
    ---------------------------------------------------

    File: /hostroot/etc/resolv.conf
      SHA512 : sTQYpB/AL7FeoGtu/1g7opv6C+KT1CBJ , qAeM+a8yTgHPnIHMaRlS+so61EN8VOpg

    Events: <none>

    Due to the config map data size limit, AIDE logs over 1 MB are added to the failure config map as a base64-encoded gzip archive. In this case, pipe the output of the above command to base64 --decode | gunzip. Compressed logs are indicated by the presence of a file-integrity.openshift.io/compressed annotation key in the config map.
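    The decode pipeline can be exercised locally. This sketch compresses a sample log line the same way (gzip, then base64) as a stand-in for the compressed config map data, then reverses the process:

    ```shell
    # Stand-in for compressed config map data: gzip a sample log line and
    # base64-encode it, as the operator does for logs over 1 MB.
    echo "AIDE found differences" | gzip | base64 > /tmp/encoded.log

    # Reverse the process, as you would with the extracted config map value.
    base64 --decode < /tmp/encoded.log | gunzip
    ```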

    Understanding events

    Transitions in the status of the FileIntegrity and FileIntegrityNodeStatus objects are logged by events. The creation time of the event reflects the latest transition, such as Initializing to Active, and not necessarily the latest scan result. However, the newest event always reflects the most recent status.

    $ oc get events --field-selector reason=FileIntegrityStatus

    Example output

    LAST SEEN   TYPE     REASON                OBJECT                                MESSAGE
    97s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Pending
    67s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Initializing
    37s         Normal   FileIntegrityStatus   fileintegrity/example-fileintegrity   Active

    When a node scan fails, an event is created with the added/changed/removed file counts and the config map information.

    $ oc get events --field-selector reason=NodeIntegrityStatus

    Example output

    LAST SEEN   TYPE     REASON                OBJECT                                MESSAGE
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-134-173.ec2.internal
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-168-238.ec2.internal
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-169-175.ec2.internal
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-152-92.ec2.internal
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-158-144.ec2.internal
    114m        Normal   NodeIntegrityStatus   fileintegrity/example-fileintegrity   no changes to node ip-10-0-131-30.ec2.internal
