The data locality setting is intended for situations where at least one replica of a Longhorn volume should be scheduled on the same node as the pod that uses the volume, whenever possible. We refer to the property of having a local replica as having data locality.

For example, data locality can be useful when the cluster’s network is slow or unreliable, because having a local replica increases the availability of the volume.
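The property described above can be stated as a simple check. The following is a minimal sketch, not Longhorn code: the function name `has_data_locality` and the node names are hypothetical and chosen only for illustration.

```python
def has_data_locality(replica_nodes, workload_node):
    """Return True if at least one replica is on the node that runs the workload."""
    return workload_node in set(replica_nodes)


# Hypothetical placements: the pod runs on node-b in both cases.
print(has_data_locality(["node-a", "node-b", "node-c"], "node-b"))
print(has_data_locality(["node-a", "node-c"], "node-b"))
```

The first placement has a replica next to the pod, so the volume has data locality; the second does not.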

Data locality can also be useful for distributed applications (e.g. databases) in which high availability is achieved at the application level instead of the volume level. In that case, only one volume is needed for each pod, so each volume should be scheduled on the same node as the pod that uses it. In addition, the default Longhorn behavior for volume scheduling could cause a problem for distributed applications. If a workload runs two pod replicas, each with its own volume, Longhorn is not aware that those volumes hold the same data and should not be scheduled on the same node. Longhorn could therefore schedule identical replicas on the same node, preventing them from providing high availability for the workload.

Longhorn currently supports two modes for data locality settings:

  • disabled: This is the default option. There may or may not be a replica on the same node as the attached volume (workload).

How to Set Data Locality For Volumes

There are three ways to set data locality for Longhorn volumes:

You can change the global default setting for data locality in the Longhorn UI settings. The global setting only functions as a default value, similar to the replica count. It doesn’t change any existing volume’s settings. When a volume is created without specifying data locality, Longhorn uses the global default setting to determine data locality for the volume.
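The fallback behavior described above can be modeled in a few lines. This is a minimal sketch under stated assumptions, not Longhorn's implementation: the `Settings` and `Volume` classes and the `create_volume` helper are hypothetical, and the mode strings `disabled` and `best-effort` are used here only as example values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    # Hypothetical stand-in for the global default data locality setting.
    default_data_locality: str = "disabled"


@dataclass
class Volume:
    name: str
    data_locality: str


def create_volume(name: str, settings: Settings,
                  data_locality: Optional[str] = None) -> Volume:
    """Create a volume, reading the global default only when no value is given."""
    if data_locality is None:
        data_locality = settings.default_data_locality
    # The chosen value is stored on the volume, so later changes to the
    # global default do not affect this volume.
    return Volume(name=name, data_locality=data_locality)


settings = Settings()
vol_a = create_volume("vol-a", settings)              # uses the current default
settings.default_data_locality = "best-effort"        # change the global default
vol_b = create_volume("vol-b", settings)              # picks up the new default
vol_c = create_volume("vol-c", settings, "disabled")  # explicit value wins

for vol in (vol_a, vol_b, vol_c):
    print(vol.name, vol.data_locality)
```

In this model, `vol-a` keeps the value it was created with even after the default changes, which mirrors how the global setting leaves existing volumes untouched.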

You can use the Longhorn UI to set data locality for a volume upon creation. You can also change the data locality setting for a volume after creation on the volume detail page.