Frequently Asked Questions
What is Alluxio?
Alluxio, formerly Tachyon, is an open source, memory-speed virtual distributed storage system. It enables any application to interact with any data from any storage system at memory speed. Read more about Alluxio.
What platforms and Java versions can Alluxio run on?
Alluxio requires JDK 1.8 or JDK 11 and runs on various Linux distributions and macOS.
What license is Alluxio under?
Alluxio is open sourced under the Apache 2.0 license.
Why is my analytics job not running faster after deploying Alluxio?
Some possible reasons to consider:
- The job is compute-bound and does not spend significant time reading or writing data. Because the bottleneck is not I/O performance, the benefit from faster Alluxio I/O is small.
- The persistent storage is co-located with compute (e.g. Alluxio is connected to a local HDFS) and the input data of the job is already in the OS buffer cache.
- Due to misconfiguration, clients are not able to identify their corresponding local Alluxio worker. They then read from remote Alluxio workers over the network, which lowers data locality.
- Input data has not yet been loaded into Alluxio or has already been evicted, causing the job to read from the under storage instead of the Alluxio cache.
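The first reason above can be made concrete with Amdahl's law: if only the I/O portion of a job gets faster, the end-to-end speedup is capped by how much of the runtime was I/O to begin with. The sketch below is a back-of-the-envelope estimate, not an Alluxio tool; the function name and the I/O fractions and 10x I/O speedup are hypothetical values chosen for illustration.

```python
def overall_speedup(io_fraction: float, io_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the I/O part gets faster.

    io_fraction: share of the original runtime spent on I/O (0.0 to 1.0).
    io_speedup:  how many times faster the I/O part becomes.
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


# Two hypothetical jobs, both with I/O made 10x faster.
compute_bound = overall_speedup(io_fraction=0.05, io_speedup=10.0)
io_bound = overall_speedup(io_fraction=0.80, io_speedup=10.0)

print(f"compute-bound job: {compute_bound:.2f}x")  # 1.05x
print(f"I/O-bound job:     {io_bound:.2f}x")       # 3.57x
```

A job that spends 5% of its time on I/O gains almost nothing even from much faster I/O, while a job that spends 80% of its time on I/O benefits substantially. Profiling the job to measure its actual I/O share is the first step before tuning anything else.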
Should I deploy Alluxio as a stand-alone system or through an orchestration framework?
It is recommended to deploy Alluxio as a stand-alone system. Orchestration frameworks supported include:
What programming languages does Alluxio support?
Alluxio is primarily developed in Java and exposes Java-like file APIs for other applications to interact with. Alluxio also has experimental bindings for other languages, including Python.
What happens if my data set does not fit in memory?
The input data set does not need to fit in Alluxio storage space for applications to work. Alluxio transparently loads data on demand from the under storage. To fit more data in Alluxio's storage space, configure Alluxio to use other storage resources such as SSD and HDD in addition to memory. Read more about Alluxio storage setup.
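The on-demand behavior described above can be pictured as a read-through cache: a read that misses the cache falls back to the under storage, and older blocks are evicted when space runs out. The toy model below is only an illustration of that idea, not Alluxio's implementation; the class name, the block IDs, and the choice of least-recently-used eviction are all assumptions made for the sketch.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Toy cache that loads blocks on demand and evicts the least recently used."""

    def __init__(self, capacity_blocks: int,
                 load_from_under_storage: Callable[[str], bytes]):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self._capacity = capacity_blocks
        self._load = load_from_under_storage
        self._blocks = OrderedDict()  # block_id -> bytes, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id: str) -> bytes:
        if block_id in self._blocks:
            # Cache hit: mark the block as most recently used.
            self._blocks.move_to_end(block_id)
            self.hits += 1
            return self._blocks[block_id]
        # Cache miss: fetch from the (slower) under storage and cache it.
        self.misses += 1
        data = self._load(block_id)
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict least recently used
        return data


# Hypothetical data set of 5 blocks, with cache space for only 2.
under_storage = {f"block-{i}": f"data-{i}".encode() for i in range(5)}
cache = ReadThroughCache(capacity_blocks=2,
                         load_from_under_storage=under_storage.__getitem__)

for block_id in under_storage:
    cache.read(block_id)   # every first read is a miss

cache.read("block-4")      # still cached: hit
cache.read("block-0")      # evicted earlier: reloaded from under storage

print(cache.hits, cache.misses)  # 1 6
```

Every read succeeds even though the data set is larger than the cache; the cost of an undersized cache is extra misses, not failures. Adding capacity (for example SSD or HDD tiers) reduces how often reads fall through to the under storage.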
Does Alluxio support a high availability mode?
Yes. See the instructions for running Alluxio in high availability mode.
Will Alluxio rebalance cached blocks to the newly added nodes in order to balance memory space utilization?
No, rebalancing of data blocks in Alluxio is not currently supported.
How can I add support for other under storage systems?
Many contributors are working on support for additional under storage systems. See the documentation for adding other under storage systems.
Does Alluxio require HDFS?
No, Alluxio can run on many under storage systems such as Amazon S3 or Swift in addition to HDFS.
How can I learn more about Alluxio?
Join the Alluxio community Slack Channel to chat with users and developers.
Read the recent blogs.
Join the meetup group for Alluxio at http://www.meetup.com/Alluxio/. Other Alluxio events can be found .
Where can I report issues or propose new features?
GitHub issues are used to track feature development and issues. To report an issue or propose a feature, open a GitHub issue.
Where can I get more help?
For any questions related to installation, contribution or feedback, please join our community Slack channel or send an email to the Alluxio User Mailing List. We look forward to seeing you there.
How can I contribute to Alluxio?
Thank you for your interest in contributing. Please read our contributor guide.