The Node Feature Discovery Operator
The Node Feature Discovery Operator (NFD) manages the detection of hardware features and configuration in a OKD cluster by labeling the nodes with hardware-specific information. NFD labels the host with node-specific attributes, such as PCI cards, kernel, operating system version, and so on.
The NFD Operator can be found on the Operator Hub by searching for “Node Feature Discovery”.
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the NFD daemon set. As a cluster administrator, you can install the NFD Operator using the OKD CLI or the web console.
As a cluster administrator, you can install the NFD Operator using the CLI.
Prerequisites
An OKD cluster
Install the OpenShift CLI ().
Log in as a user with
cluster-admin
privileges.
Procedure
Create a namespace for the NFD Operator.
Create the following
Namespace
custom resource (CR) that defines theopenshift-nfd
namespace, and then save the YAML in thenfd-namespace.yaml
file:Create the namespace by running the following command:
$ oc create -f nfd-namespace.yaml
Install the NFD Operator in the namespace you created in the previous step by creating the following objects:
Create the following
OperatorGroup
CR and save the YAML in thenfd-operatorgroup.yaml
file:apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
generateName: openshift-nfd-
name: openshift-nfd
namespace: openshift-nfd
spec:
targetNamespaces:
- openshift-nfd
Create the
OperatorGroup
CR by running the following command:$ oc create -f nfd-operatorgroup.yaml
Run the following command to get the
channel
value required for the next step.$ oc get packagemanifest nfd -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
Example output
4.8
Create the following
Subscription
CR and save the YAML in thenfd-sub.yaml
file:Example Subscription
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: nfd
namespace: openshift-nfd
spec:
channel: "4.8"
installPlanApproval: Automatic
name: nfd
source: redhat-operators
sourceNamespace: openshift-marketplace
Create the subscription object by running the following command:
$ oc create -f nfd-sub.yaml
Change to the
openshift-nfd
project:$ oc project openshift-nfd
Verification
To verify that the Operator deployment is successful, run:
Example output
NAME READY STATUS RESTARTS AGE
nfd-controller-manager-7f86ccfb58-vgr4x 2/2 Running 0 10m
A successful deployment shows a
Running
status.
Installing the NFD Operator using the web console
As a cluster administrator, you can install the NFD Operator using the web console.
Procedure
In the OKD web console, click Operators → OperatorHub.
Choose Node Feature Discovery from the list of available Operators, and then click Install.
On the Install Operator page, select a specific namespace on the cluster, select the namespace created in the previous section, and then click Install.
Verification
To verify that the NFD Operator installed successfully:
Navigate to the Operators → Installed Operators page.
Ensure that Node Feature Discovery is listed in the openshift-nfd project with a Status of InstallSucceeded.
During installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
Troubleshooting
If the Operator does not appear as installed, troubleshoot further:
Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
Navigate to the Workloads → Pods page and check the logs for pods in the
openshift-nfd
project.
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the Node-Feature-Discovery daemon set by watching for a NodeFeatureDiscovery
CR. Based on the NodeFeatureDiscovery
CR, the Operator will create the operand (NFD) components in the desired namespace. You can edit the CR to choose another namespace
, image
, imagePullPolicy
, and nfd-worker-conf
, among other options.
As a cluster administrator, you can create a NodeFeatureDiscovery
instance using the OKD CLI or the web console.
As a cluster administrator, you can create a NodeFeatureDiscovery
CR instance using the CLI.
Prerequisites
An OKD cluster
Install the OpenShift CLI (
oc
).Log in as a user with
cluster-admin
privileges.
Procedure
-
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
name: nfd-instance
namespace: openshift-nfd
spec:
instance: "" # instance is empty by default
operand:
namespace: openshift-nfd
image: quay.io/openshift/origin-node-feature-discovery:4.8
imagePullPolicy: Always
workerConfig:
configData: |
#core:
# labelWhiteList:
# noPublish: false
# sleepInterval: 60s
# sources: [all]
# klog:
# addDirHeader: false
# alsologtostderr: false
# logBacktraceAt:
# logtostderr: true
# skipHeaders: false
# stderrthreshold: 2
# v: 0
# vmodule:
## NOTE: the following options are not dynamically run-time configurable
## and require a nfd-worker restart to take effect after being changed
# logDir:
# logFile:
# logFileMaxSize: 1800
# skipLogHeaders: false
#sources:
# cpu:
# cpuid:
## NOTE: whitelist has priority over blacklist
# attributeBlacklist:
# - "BMI1"
# - "BMI2"
# - "CLMUL"
# - "CMOV"
# - "CX16"
# - "ERMS"
# - "HTT"
# - "LZCNT"
# - "MMX"
# - "MMXEXT"
# - "NX"
# - "POPCNT"
# - "RDRAND"
# - "RDSEED"
# - "RDTSCP"
# - "SGX"
# - "SSE"
# - "SSE2"
# - "SSE3"
# - "SSE4.1"
# - "SSE4.2"
# - "SSSE3"
# attributeWhitelist:
# kernel:
# kconfigFile: "/path/to/kconfig"
# configOpts:
# - "NO_HZ"
# - "X86"
# - "DMI"
# pci:
# deviceClassWhitelist:
# - "0200"
# - "03"
# - "12"
# deviceLabelFields:
# - "class"
# - "vendor"
# - "device"
# - "subsystem_vendor"
# - "subsystem_device"
# usb:
# deviceClassWhitelist:
# - "0e"
# - "ef"
# - "fe"
# - "ff"
# deviceLabelFields:
# - "class"
# - "device"
# custom:
# - name: "my.kernel.feature"
# matchOn:
# - loadedKMod: ["example_kmod1", "example_kmod2"]
# - name: "my.pci.feature"
# matchOn:
# - pciId:
# class: ["0200"]
# vendor: ["15b3"]
# device: ["1014", "1017"]
# - pciId :
# vendor: ["8086"]
# device: ["1000", "1100"]
# - name: "my.usb.feature"
# matchOn:
# - usbId:
# class: ["ff"]
# vendor: ["03e7"]
# device: ["2485"]
# - usbId:
# class: ["fe"]
# vendor: ["1a6e"]
# device: ["089a"]
# - name: "my.combined.feature"
# matchOn:
# - pciId:
# vendor: ["15b3"]
# device: ["1014", "1017"]
# loadedKMod : ["vendor_kmod1", "vendor_kmod2"]
customConfig:
configData: |
# - name: "more.kernel.features"
# matchOn:
# - loadedKMod: ["example_kmod3"]
# - name: "more.features.by.nodename"
# value: customValue
# matchOn:
# - nodename: ["special-.*-node-.*"]
Create the
NodeFeatureDiscovery
CR instance by running the following command:$ oc create -f NodeFeatureDiscovery.yaml
Verification
To verify that the instance is created, run:
$ oc get pods
Example output
NAME READY STATUS RESTARTS AGE
nfd-controller-manager-7f86ccfb58-vgr4x 2/2 Running 0 11m
nfd-master-hcn64 1/1 Running 0 60s
nfd-master-lnnxx 1/1 Running 0 60s
nfd-master-mp6hr 1/1 Running 0 60s
nfd-worker-vgcz9 1/1 Running 0 60s
nfd-worker-xqbws 1/1 Running 0 60s
A successful deployment shows a
Running
status.
Create a NodeFeatureDiscovery CR using the web console
Procedure
Navigate to the Operators → Installed Operators page.
Find Node Feature Discovery and see a box under Provided APIs.
Click Create instance.
Edit the values of the
NodeFeatureDiscovery
CR.Click Create.
The core
section contains common configuration settings that are not specific to any particular feature source.
core.sleepInterval
core.sleepInterval
specifies the interval between consecutive passes of feature detection or re-detection, and thus also the interval between node re-labeling. A non-positive value implies infinite sleep interval; no re-detection or re-labeling is done.
This value is overridden by the deprecated --sleep-interval
command line flag, if specified.
Example usage
core:
The default value is 60s
.
core.sources
core.sources
specifies the list of enabled feature sources. A special value all
enables all feature sources.
This value is overridden by the deprecated --sources
command line flag, if specified.
Default: [all]
Example usage
core:
sources:
- system
- custom
core.labelWhiteList
core.labelWhiteList
specifies a regular expression for filtering feature labels based on the label name. Non-matching labels are not published.
The regular expression is only matched against the basename part of the label, the part of the name after ‘/‘. The label prefix, or namespace, is omitted.
This value is overridden by the deprecated --label-whitelist
command line flag, if specified.
Default: null
Example usage
core:
labelWhiteList: '^cpu-cpuid'
core.noPublish
Setting core.noPublish
to true
disables all communication with the nfd-master
. It is effectively a dry run flag; nfd-worker
runs feature detection normally, but no labeling requests are sent to nfd-master
.
This value is overridden by the --no-publish
command line flag, if specified.
Example:
Example usage
The default value is false
.
core.klog
The following options specify the logger configuration, most of which can be dynamically adjusted at run-time.
The logger options can also be specified using command line flags, which take precedence over any corresponding config file options.
core.klog.addDirHeader
If set to true
, core.klog.addDirHeader
adds the file directory to the header of the log messages.
Default: false
Run-time configurable: yes
core.klog.alsologtostderr
Log to standard error as well as files.
Default: false
Run-time configurable: yes
core.klog.logBacktraceAt
When logging hits line file:N, emit a stack trace.
Default: empty
Run-time configurable: yes
core.klog.logDir
If non-empty, write log files in this directory.
Default: empty
Run-time configurable: no
core.klog.logFile
If not empty, use this log file.
Default: empty
Run-time configurable: no
core.klog.logFileMaxSize
core.klog.logFileMaxSize
defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0
, the maximum file size is unlimited.
Default: 1800
core.klog.logtostderr
Log to standard error instead of files
Default: true
Run-time configurable: yes
core.klog.skipHeaders
If core.klog.skipHeaders
is set to true
, avoid header prefixes in the log messages.
Default: false
Run-time configurable: yes
core.klog.skipLogHeaders
If core.klog.skipLogHeaders
is set to true
, avoid headers when opening log files.
Default: false
Run-time configurable: no
core.klog.stderrthreshold
Logs at or above this threshold go to stderr.
Default: 2
Run-time configurable: yes
core.klog.v
core.klog.v
is the number for the log level verbosity.
Default: 0
Run-time configurable: yes
core.klog.vmodule
core.klog.vmodule
is a comma-separated list of pattern=N
settings for file-filtered logging.
Default: empty
Run-time configurable: yes
The sources
section contains feature source specific configuration parameters.
sources.cpu.cpuid.attributeBlacklist
Prevent publishing cpuid
features listed in this option.
This value is overridden by sources.cpu.cpuid.attributeWhitelist
, if specified.
Default: [BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SGXLC, SSE, SSE2, SSE3, SSE4.1, SSE4.2, SSSE3]
Example usage
sources:
cpu:
cpuid:
attributeBlacklist: [MMX, MMXEXT]
sources.cpu.cpuid.attributeWhitelist
Only publish the cpuid
features listed in this option.
sources.cpu.cpuid.attributeWhitelist
takes precedence over sources.cpu.cpuid.attributeBlacklist
.
Default: empty
Example usage
sources:
cpu:
cpuid:
attributeWhitelist: [AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512VL]
sources.kernel.kconfigFile
sources.kernel.kconfigFile
is the path of the kernel config file. If empty, NFD runs a search in the well-known standard locations.
Default: empty
Example usage
sources:
kernel:
kconfigFile: "/path/to/kconfig"
sources.kernel.configOpts
sources.kernel.configOpts
represents kernel configuration options to publish as feature labels.
Default: [NO_HZ, NO_HZ_IDLE, NO_HZ_FULL, PREEMPT]
Example usage
sources:
kernel:
configOpts: [NO_HZ, X86, DMI]
soures.pci.deviceClassWhitelist
soures.pci.deviceClassWhitelist
is a list of PCI device class IDs for which to publish a label. It can be specified as a main class only (for example, 03
) or full class-subclass combination (for example 0300
). The former implies that all subclasses are accepted. The format of the labels can be further configured with deviceLabelFields
.
Default: ["03", "0b40", "12"]
Example usage
sources:
pci:
deviceClassWhitelist: ["0200", "03"]
soures.pci.deviceLabelFields
soures.pci.deviceLabelFields
is the set of PCI ID fields to use when constructing the name of the feature label. Valid fields are class
, vendor
, device
, subsystem_vendor
and subsystem_device
.
Default: [class, vendor]
Example usage
sources:
pci:
deviceLabelFields: [class, vendor, device]
With the example config above, NFD would publish labels such as feature.node.kubernetes.io/pci-<class-id>_<vendor-id>_<device-id>.present=true
soures.usb.deviceClassWhitelist
soures.usb.deviceClassWhitelist
is a list of USB IDs for which to publish a feature label. The format of the labels can be further configured with deviceLabelFields
.
Default: ["0e", "ef", "fe", "ff"]
Example usage
sources:
usb:
deviceClassWhitelist: ["ef", "ff"]
soures.usb.deviceLabelFields
soures.usb.deviceLabelFields
is the set of USB ID fields from which to compose the name of the feature label. Valid fields are class
, vendor
, and device
.
Default: [class, vendor, device]
Example usage
sources:
pci:
With the example config above, NFD would publish labels like: feature.node.kubernetes.io/usb-<class-id>_<vendor-id>.present=true
.
soures.custom
is the list of rules to process in the custom feature source to create user-specific labels.
Default: empty