Topology awareness
- The topology information should be configured by Ozone.
- When Ozone reads a Key it should prefer to read from the closest node.
Ozone uses RAFT replication for open containers (write) and asynchronous replication for closed, immutable containers (cold data). Because RAFT requires a low-latency network, topology-aware placement is available only for closed containers. See the page about Containers for more information on open vs. closed containers.
Topology hierarchy can be configured using the net.topology.node.switch.mapping.impl configuration key. This configuration should define an implementation of org.apache.hadoop.net.CachedDNSToSwitchMapping. As this is a Hadoop class, the configuration is exactly the same as the Hadoop configuration.
A static list can be configured with the help of TableMapping.
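To make the static list concrete, here is a minimal Python sketch of the lookup that a table-based mapping performs. It assumes a two-column, whitespace-separated file (host or IP address, then rack) and a fallback of /default-rack for unknown hosts. The sample addresses, rack names, and function names are made up for illustration; this is not the TableMapping implementation itself.

```python
# Rack assumed for hosts that are missing from the table.
DEFAULT_RACK = "/default-rack"

# Hypothetical table content: first column is a host or IP, second is its rack.
SAMPLE = """\
10.0.0.1 /rack1
10.0.0.2 /rack1
dn3.example.com /rack2
"""


def load_table(text):
    """Parse 'host rack' lines into a dict, skipping malformed lines."""
    table = {}
    for line in text.splitlines():
        columns = line.split()
        if len(columns) == 2:
            host, rack = columns
            table[host] = rack
    return table


def resolve(table, hosts):
    """Return one rack per host, in order, using the default for unknown hosts."""
    return [table.get(host, DEFAULT_RACK) for host in hosts]
```

Calling resolve(load_table(SAMPLE), [...]) with a mix of listed and unlisted hosts shows how unlisted nodes all end up in the same default rack, which is why an incomplete table can silently flatten the topology.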
Dynamic list
Rack information can be identified with the help of an external script.
If an external script is used, it is specified with a parameter in the configuration files. Unlike the Java class, the external topology script is not included with the Ozone distribution and must be provided by the administrator. Ozone sends multiple IP addresses as arguments (ARGV) when forking the topology script. The number of IP addresses sent per invocation is controlled by net.topology.script.number.args and defaults to 100. If net.topology.script.number.args is set to 1, a topology script is forked for each IP address submitted.
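A topology script of this kind might look like the following Python sketch. It assumes the script receives IP addresses as command-line arguments and prints one rack per argument, separated by whitespace and in the same order. The subnet-to-rack table and all function names are hypothetical. The batches helper is not part of a real script; it only illustrates how the argument limit splits a long address list into separate invocations.

```python
#!/usr/bin/env python3
import sys

# Rack reported for addresses that match no known subnet.
DEFAULT_RACK = "/default-rack"

# Hypothetical mapping from the first three octets of an IP to a rack.
SUBNET_TO_RACK = {
    "10.0.1": "/rack1",
    "10.0.2": "/rack2",
}


def rack_for(ip):
    """Map an IPv4 address to a rack by its first three octets."""
    subnet = ip.rpartition(".")[0]
    return SUBNET_TO_RACK.get(subnet, DEFAULT_RACK)


def batches(addresses, max_args=100):
    """Split addresses into chunks of at most max_args, one chunk per fork."""
    return [addresses[i:i + max_args] for i in range(0, len(addresses), max_args)]


def main(argv):
    # Print one rack per argument, in the same order as the arguments.
    print(" ".join(rack_for(ip) for ip in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With the default limit of 100, a list of 250 addresses would lead to three invocations. With a limit of 1, every address causes its own fork, which is why a low value can become expensive on large clusters.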
Placement of closed containers can be configured with the ozone.scm.container.placement.impl configuration key. The available container placement policies can be found in the package.
This placement policy complies with the algorithm used in HDFS. With the default of 3 replicas, two replicas are placed on the same rack and the third on a different rack.
This implementation applies to network topologies like “/rack/node”. It is not recommended if the network topology has more layers.
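The resulting replica layout can be illustrated with a small Python sketch. It only reproduces the outcome described above (two replicas sharing a rack, the third on another rack) for a flat "/rack/node" topology. It is not the selection logic Ozone or HDFS actually uses, and the function name and node names are hypothetical.

```python
import random
from collections import defaultdict


def choose_rack_aware(nodes, rng=random):
    """Pick three nodes: two on one rack, the third on a different rack.

    nodes maps a node name to its rack, for example {"dn1": "/rack1"}.
    """
    by_rack = defaultdict(list)
    for node, rack in nodes.items():
        by_rack[rack].append(node)

    # Only racks with at least two nodes can host the co-located pair.
    pair_racks = [rack for rack, members in by_rack.items() if len(members) >= 2]
    if not pair_racks or len(by_rack) < 2:
        raise ValueError("need at least two racks, one of them with two nodes")

    pair_rack = rng.choice(pair_racks)
    pair = rng.sample(by_rack[pair_rack], 2)

    # The third replica goes to any node on a different rack.
    other_rack = rng.choice([rack for rack in by_rack if rack != pair_rack])
    third = rng.choice(by_rack[other_rack])
    return pair + [third]
```

The sketch fails fast when the cluster has a single rack, or no rack with two nodes, because the layout cannot be satisfied in that case. A real policy would more likely fall back to a weaker placement than raise an error.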
Finally, the read path should also be configured to read the data from the closest pipeline.
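What "closest" means can be sketched in Python as follows. The sketch assumes locations are written as paths such as /rack1/dn1 and that distance is the number of hops to the closest common ancestor and back. Under that assumption the same node has distance 0, the same rack 2, and a different rack 4. The function names are hypothetical and the code is an illustration, not Ozone's read-path implementation.

```python
def distance(a, b):
    """Count hops between two locations via their closest common ancestor."""
    parts_a = a.strip("/").split("/")
    parts_b = b.strip("/").split("/")
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    return (len(parts_a) - common) + (len(parts_b) - common)


def sort_by_distance(reader, replicas):
    """Order replica locations from closest to farthest from the reader."""
    return sorted(replicas, key=lambda location: distance(reader, location))
```

A client at /rack1/dn1 would then try a local replica first, a replica on /rack1 second, and replicas on other racks last.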
- Hadoop documentation about net.topology.node.switch.mapping.impl
- Design doc