Scalability Tuning

    Alluxio is a scalable distributed file system designed to handle many workers within a single cluster. Several parameters can be tuned to prevent the Alluxio master from being overloaded. This page details the parameters to tune when scaling a cluster.

    The Alluxio master heap size controls the total number of files that can fit into the master memory. If using the ROCKS off-heap metastore, the master heap size must be large enough to fit the inode cache. Provision roughly 1 KB of space for each inode. The following JVM options, set in , determine the respective maximum heap sizes for the Alluxio master and standby master processes to 256 GB:

    • As a rule of thumb set the min and max heap size equal to avoid heap resizing.
    • Each thread spawned by the master JVM requires off heap space determined by the thread stack size. When setting the heap size, ensure that there is enough memory allocated for off heap storage. For example, spawning 4000 threads with a default thread stack size of 1 MB requires at least 4 GB of off-heap space available.

    Operating System Limits

    An exception message like java.lang.OutOfMemoryError: unable to create new native thread indicates that operating system limits may need tuning.

    Several parameters in the Linux kernel limit the number of threads that a process can spawn:

    • kernel.pid_max: Run sysctl -w kernel.pid_max=<new value> as root
    • kernel.thread_max: Run sysctl -w kernel.thread_max=<new value> as root
    • Max user process limit: Run ulimit -u <new value>
    • Max open files limit: Run ulimit -n <new value>
    • User specific pid_max limit: Run command sudo echo <new value> > /sys/fs/cgroup/pids/user.slice/user-<userid>.slice/pids.max as root

    The frequency with which the master checks for lost workers is set by the alluxio.master.worker.heartbeat.interval property, with a default value of 10s. Increase the interval to reduce the number of heartbeat checks.

    Heap Size

    Alluxio workers require modest amounts of memory because off-heap storage is used for data storage. Therefore, a 4 GB heap is sufficient for Alluxio workers.

    1. ALLUXIO_WORKER_JAVA_OPTS+=" -Xms4g -Xmx4g"

    The frequency with which a worker checks in with the master is set by the following property:

    controls the heartbeat interval for the block service in Alluxio. Again, increase the interval to reduce the number of heartbeat checks.

    Keepalive Time and Timeout

    1. alluxio.worker.network.keepalive.time=30s
    2. alluxio.worker.network.keepalive.timeout=30s

    alluxio.worker.network.keepalive.time controls the maximum wait time since a client sent the last message before worker issues a keepalive request. alluxio.worker.network.keepalive.timeout controls the maximum wait time after a keepalive request is sent before the worker determines the client is no longer alive and closes the connection.

    The following properties tune RPC retry intervals:

    The retry duration and sleep duration should be increased if frequent timeouts are observed when a client attempts to communicate with the Alluxio master.

    Keepalive Time and Timeout

    The Alluxio client can also be configured to check the health of connected workers using keepalive pings. This is controlled by the following properties

    1. alluxio.user.network.keepalive.time=2h