Scalability Tuning
Alluxio is a scalable distributed file system designed to handle many workers within a single cluster. Several parameters can be tuned to prevent the Alluxio master from being overloaded. This page details the parameters to tune when scaling a cluster.
The Alluxio master heap size controls the total number of files that can fit into the master memory. If using the ROCKS off-heap metastore, the master heap size must be large enough to fit the inode cache. Provision roughly 1 KB of space for each inode. The following JVM options, set in , determine the respective maximum heap sizes for the Alluxio master and standby master processes to 256 GB
:
- As a rule of thumb set the min and max heap size equal to avoid heap resizing.
- Each thread spawned by the master JVM requires off heap space determined by the thread stack size. When setting the heap size, ensure that there is enough memory allocated for off heap storage. For example, spawning
4000
threads with a default thread stack size of1 MB
requires at least4 GB
of off-heap space available.
Operating System Limits
An exception message like java.lang.OutOfMemoryError: unable to create new native thread
indicates that operating system limits may need tuning.
Several parameters in the Linux kernel limit the number of threads that a process can spawn:
kernel.pid_max
: Runsysctl -w kernel.pid_max=<new value>
as rootkernel.thread_max
: Runsysctl -w kernel.thread_max=<new value>
as root- Max user process limit: Run
ulimit -u <new value>
- Max open files limit: Run
ulimit -n <new value>
- User specific pid_max limit: Run command
sudo echo <new value> > /sys/fs/cgroup/pids/user.slice/user-<userid>.slice/pids.max
as root
The frequency with which the master checks for lost workers is set by the alluxio.master.worker.heartbeat.interval
property, with a default value of 10s
. Increase the interval to reduce the number of heartbeat checks.
Heap Size
Alluxio workers require modest amounts of memory because off-heap storage is used for data storage. Therefore, a 4 GB heap is sufficient for Alluxio workers.
ALLUXIO_WORKER_JAVA_OPTS+=" -Xms4g -Xmx4g"
The frequency with which a worker checks in with the master is set by the following property:
controls the heartbeat interval for the block service in Alluxio. Again, increase the interval to reduce the number of heartbeat checks.
Keepalive Time and Timeout
alluxio.worker.network.keepalive.time=30s
alluxio.worker.network.keepalive.timeout=30s
alluxio.worker.network.keepalive.time
controls the maximum wait time since a client sent the last message before worker issues a keepalive request. alluxio.worker.network.keepalive.timeout
controls the maximum wait time after a keepalive request is sent before the worker determines the client is no longer alive and closes the connection.
The following properties tune RPC retry intervals:
The retry duration and sleep duration should be increased if frequent timeouts are observed when a client attempts to communicate with the Alluxio master.
Keepalive Time and Timeout
The Alluxio client can also be configured to check the health of connected workers using keepalive pings. This is controlled by the following properties
alluxio.user.network.keepalive.time=2h