Running Alluxio on Alibaba Cloud Container Service for Kubernetes (ACK)

    This guide describes how to install and configure Alluxio on Alibaba Cloud Container Service for Kubernetes (ACK).

    • ACK version >= 1.12.6

    This section describes how to install Alluxio on Alibaba Cloud Container Service for Kubernetes (ACK) in a few steps.

    Before installing the Alluxio components, label the Kubernetes nodes with “alluxio=true” as follows:

    Select cluster

    Log in to the Container Service - Kubernetes Console. Under the Kubernetes menu, click “Clusters” > “Nodes” in the left navigation bar to open the node list page. Select the target cluster and click “Manage Labels” in the upper right corner of the page.

    Select nodes

    In the node list, select nodes in batches, and then click “Add Label”.

    Add label

    Enter “alluxio” as the label name and “true” as the value, then click “OK”.
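The same label can also be applied from the command line. The sketch below is a possible alternative to the console steps, assuming kubectl is already configured for the target cluster. The function name `label_alluxio_nodes` and the node names in the usage comment are hypothetical.

```shell
# Hypothetical helper: apply the alluxio=true label to every node passed as an argument.
label_alluxio_nodes() {
  local node
  for node in "$@"; do
    # --overwrite keeps the call idempotent if the label is already present
    kubectl label nodes "$node" alluxio=true --overwrite
  done
}

# Example (node names are placeholders):
#   label_alluxio_nodes cn-beijing.192.168.5.201 cn-beijing.192.168.5.202
#   kubectl get nodes -l alluxio=true
```

Listing nodes with the `-l alluxio=true` selector afterwards is a quick way to confirm which nodes carry the label.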

    Log in to the Container Service - Kubernetes Console. Select “Marketplace” > “App Catalog” in the left navigation bar, and select alluxio on the right. On the “App Catalog” > “alluxio” page, select the cluster and namespace created in the prerequisites in the creation panel on the right, and click “Create”.

    Use kubectl to check that Alluxio is running by opening a shell in the master pod and reporting the worker capacity:

    $ kubectl exec -ti alluxio-master-0 -n alluxio -- bash
    bash-4.4# alluxio fsadmin report capacity
    Capacity information for all workers:
        Total Capacity: 2048.00MB
            Tier: MEM  Size: 2048.00MB
        Used Capacity: 0B
            Tier: MEM  Size: 0B
        Used Percentage: 0%
        Free Percentage: 100%

    Worker Name      Last Heartbeat   Storage       MEM
    192.168.5.202    0                capacity      1024.00MB
                                      used          0B (0%)
    192.168.5.201    0                capacity      1024.00MB
                                      used          0B (0%)
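Before opening a shell in the master pod, it can help to confirm that every pod in the namespace has started. The following is a minimal sketch, assuming the pods live in the `alluxio` namespace used in the command above. The function name `alluxio_pods_not_running` is hypothetical.

```shell
# Hypothetical helper: print the names of pods in the alluxio namespace
# whose STATUS column is not "Running". Prints nothing when all are running.
alluxio_pods_not_running() {
  # Columns with --no-headers: NAME READY STATUS RESTARTS AGE
  kubectl get pods -n alluxio --no-headers | awk '$3 != "Running" {print $1}'
}

# Example:
#   alluxio_pods_not_running    # empty output means every pod is Running
```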

    Install spark-operator

    Go to the App Catalog and search for “ack-spark-operator” in the search box in the upper right.

    Choose to install ack-spark-operator on the target cluster (the cluster in this document is “ack-create-by-openapi-1”), and then click “Create”.

    Build Spark docker image

    Download the required Spark version from an Apache Spark download mirror. This document uses Spark 2.4.6. Run the following commands to download it:

    $ cd /root
    $ wget https://mirror.bit.edu.cn/apache/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz

    After the download completes, extract the package and set the SPARK_HOME environment variable:

    $ tar -xf spark-2.4.6-bin-hadoop2.7.tgz
    $ export SPARK_HOME=$(pwd)/spark-2.4.6-bin-hadoop2.7

    The Spark Docker image is the image used when submitting the Spark job. This image must include the Alluxio client jar package. Use the following command to obtain the Alluxio client jar package:

    After the Alluxio client jar package is ready, build the image:

    $ docker build -t \
        spark-alluxio:2.4.6 -f $SPARK_HOME/kubernetes/dockerfiles/spark/Dockerfile $SPARK_HOME
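For the build to include the Alluxio client jar, the jar has to sit inside the build context ($SPARK_HOME) before `docker build` runs. The sketch below assumes the jar has already been obtained and is available at a local path that you supply. It stages the jar in `$SPARK_HOME/jars`, which the Dockerfile shipped with Spark copies into the image. The function name `stage_alluxio_client_jar` and the jar path in the usage comment are hypothetical.

```shell
# Hypothetical helper: copy an already-obtained Alluxio client jar into the
# Spark distribution's jars/ directory so the image build picks it up.
stage_alluxio_client_jar() {
  local jar="$1"
  if [ ! -f "$jar" ]; then
    echo "client jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "${SPARK_HOME:-}/jars" ]; then
    echo "SPARK_HOME is not set or has no jars/ directory" >&2
    return 1
  fi
  cp "$jar" "$SPARK_HOME/jars/"
}

# Example (the jar path is a placeholder):
#   stage_alluxio_client_jar /path/to/alluxio-client.jar
```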

    After the image is built, there are two ways to make it available to the cluster:

    • If you have a private image registry, push the image to it and make sure the Kubernetes cluster nodes can pull the image.
    • If you do not have a private image registry, export the image with the docker save command, copy it to each node of the Kubernetes cluster with scp, and import it on each node with the docker load command, so that the image exists on every node.
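The second option can be scripted. The following is a minimal sketch, assuming SSH access to every node and Docker as the container runtime on those nodes. The function name `distribute_image`, the tarball path, and the node names in the usage comment are hypothetical.

```shell
# Hypothetical helper: export an image once, then copy and load it on each node.
# Usage: distribute_image IMAGE NODE [NODE...]
distribute_image() {
  local image="$1"; shift
  local tarball="/tmp/spark-alluxio-image.tar"   # assumed scratch location
  docker save -o "$tarball" "$image" || return 1
  local node
  for node in "$@"; do
    scp "$tarball" "$node:$tarball" || return 1
    ssh "$node" docker load -i "$tarball" || return 1
  done
}

# Example (node addresses are placeholders):
#   distribute_image spark-alluxio:2.4.6 192.168.5.201 192.168.5.202
```

When images are loaded onto nodes this way instead of being pulled from a registry, a pod spec with `imagePullPolicy: Always` may fail to start, so `IfNotPresent` is likely the safer setting in that case.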

    Upload files to Alluxio

    As mentioned at the beginning of the document, this experiment submits a Spark job to Kubernetes that counts the number of occurrences of each word in a file. First, upload the file to Alluxio storage. For convenience, you can upload the file /opt/alluxio-2.3.0/LICENSE from the Alluxio master pod (the file path may differ slightly depending on the Alluxio version).

    # The following step is executed in the alluxio-master-0 pod
    bash-4.4# alluxio fs copyFromLocal LICENSE /
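The upload can also be run without an interactive shell. This sketch assumes the pod name `alluxio-master-0` and the namespace `alluxio` used elsewhere in this guide, and that the source file already exists inside the master pod. The function name `upload_from_master` is hypothetical.

```shell
# Hypothetical helper: copy a file that already exists inside the master pod
# into the Alluxio root directory, then list the root to confirm the upload.
upload_from_master() {
  local src="$1"
  kubectl exec alluxio-master-0 -n alluxio -- alluxio fs copyFromLocal "$src" / || return 1
  kubectl exec alluxio-master-0 -n alluxio -- alluxio fs ls /
}

# Example:
#   upload_from_master /opt/alluxio-2.3.0/LICENSE
```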

    Then check which workers store the blocks of the LICENSE file.

    $ kubectl exec -ti alluxio-master-0 -n alluxio -- bash
    # The following step is executed in the alluxio-master-0 pod
    bash-4.4# alluxio fs stat /LICENSE
    /LICENSE is a file path.
    FileInfo{fileId=33554431, fileIdentifier=null, name=LICENSE, path=/LICENSE, ufsPath=/opt/alluxio/underFSStorage/LICENSE, length=27040, blockSizeBytes=67108864, creationTimeMs=1592381889733, completed=true, folder=false, pinned=false, pinnedlocation=[], cacheable=true, persisted=false, blockIds=[16777216], inMemoryPercentage=100, lastModificationTimesMs=1592381890390, ttl=-1, lastAccessTimesMs=1592381890390, ttlAction=DELETE, owner=root, group=root, mode=420, persistenceState=TO_BE_PERSISTED, mountPoint=false, replicationMax=-1, replicationMin=0, fileBlockInfos=[FileBlockInfo{blockInfo=BlockInfo{id=16777216, length=27040, locations=[BlockLocation{workerId=8217561227881498090, address=WorkerNetAddress{host=192.168.8.17, containerHost=, rpcPort=29999, dataPort=29999, webPort=30000, domainSocketPath=, tieredIdentity=TieredIdentity(node=192.168.8.17, rack=null)}, tierAlias=MEM, mediumType=MEM}]}, offset=0, ufsLocations=[]}], mountId=1, inAlluxioPercentage=100, ufsFingerprint=, acl=user::rw-,group::r--,other::r--, defaultAcl=}
    BlockInfo{id=16777216, length=27040, locations=[BlockLocation{workerId=8217561227881498090, address=WorkerNetAddress{host=192.168.8.17, containerHost=, rpcPort=29999, dataPort=29999, webPort=30000, domainSocketPath=, tieredIdentity=TieredIdentity(node=192.168.8.17, rack=null)}, tierAlias=MEM, mediumType=MEM}]}

    As shown, the LICENSE file has a single block with id 16777216, and that block is stored on the Kubernetes node with IP address 192.168.8.17.

    Using kubectl, we find that the name of this node is cn-beijing.192.168.8.17.
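Reading the worker host out of the long stat output by eye is error-prone. The following sketch extracts it with standard text tools, assuming the output keeps the `WorkerNetAddress{host=...}` layout shown above. The function name `block_hosts` is hypothetical.

```shell
# Hypothetical helper: read `alluxio fs stat` output on stdin and print
# each distinct worker host that holds a block of the file.
block_hosts() {
  grep -o 'WorkerNetAddress{host=[^,]*' | sed 's/.*host=//' | sort -u
}

# Example:
#   kubectl exec alluxio-master-0 -n alluxio -- alluxio fs stat /LICENSE | block_hosts
```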
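One possible way to do this lookup is sketched below. It assumes the worker host reported by Alluxio equals the node's InternalIP address, which is an assumption about the cluster networking rather than something this guide states. The function name `node_name_for_ip` is hypothetical.

```shell
# Hypothetical helper: print the name of the node whose InternalIP equals $1.
node_name_for_ip() {
  # Emit one "name<TAB>internal-ip" line per node, then filter on the IP.
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' \
    | awk -v ip="$1" '$2 == ip {print $1}'
}

# Example:
#   node_name_for_ip 192.168.8.17
```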

    Submit Spark job

    The following steps submit a Spark job to the Kubernetes cluster. The job counts the number of occurrences of each word in the /LICENSE file in Alluxio.

    In the previous step, we found that all blocks of the LICENSE file are on the node cn-beijing.192.168.8.17. In this experiment, we use a node selector to run the Spark driver and the Spark executor on that node, and then verify that the Spark executor and the Alluxio worker communicate through the domain socket when Alluxio’s short-circuit feature is enabled.

    • Note: If Alluxio short-circuit operations are enabled and the Spark executor runs on the same Kubernetes node as the blocks of the file it accesses (/LICENSE in this experiment), the Alluxio client in the Spark executor communicates with the Alluxio worker on that node through the domain socket.

    First, generate a YAML file for submitting the Spark job:

    $ export SPARK_ALLUXIO_IMAGE="spark-alluxio:2.4.6"
    $ export ALLUXIO_MASTER="alluxio-master-0"
    $ export TARGET_NODE="cn-beijing.192.168.8.17"
    $ cat > /tmp/spark-example.yaml <<- EOF
    apiVersion: "sparkoperator.k8s.io/v1beta2"
    kind: SparkApplication
    metadata:
      name: spark-count-words
      namespace: default
    spec:
      type: Scala
      mode: cluster
      image: "$SPARK_ALLUXIO_IMAGE"
      imagePullPolicy: Always
      mainClass: org.apache.spark.examples.JavaWordCount
      mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.6.jar"
      arguments:
        - alluxio://${ALLUXIO_MASTER}.alluxio:19998/LICENSE
      sparkVersion: "2.4.6"
      restartPolicy:
        type: Never
      volumes:
        - name: "test-volume"
          hostPath:
            path: "/tmp"
            type: Directory
        - name: "alluxio-domain"
          hostPath:
            path: "/tmp/alluxio-domain"
      driver:
        cores: 1
        memory: "512m"
        labels:
          version: 2.4.6
        serviceAccount: spark
        volumeMounts:
          - name: "test-volume"
            mountPath: "/tmp"
          - name: "alluxio-domain"
            mountPath: "/opt/domain"
        nodeSelector:
          kubernetes.io/hostname: "$TARGET_NODE"
      executor:
        cores: 1
        instances: 1
        memory: "512m"
        labels:
          version: 2.4.6
        nodeSelector:
          kubernetes.io/hostname: "$TARGET_NODE"
        volumeMounts:
          - name: "test-volume"
            mountPath: "/tmp"
          - name: "alluxio-domain"
            mountPath: "/opt/domain"
    EOF

    Then use sparkctl to submit the Spark job:

    $ sparkctl create /tmp/spark-example.yaml
    • Note: If sparkctl is not installed, refer to the sparkctl documentation to install it.
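Because a SparkApplication is a Kubernetes custom resource, the manifest can likely also be submitted with plain kubectl once the operator is installed. The sketch below assumes the operator's v1beta2 status layout, where the state is reported under `.status.applicationState.state`. The function names `submit_spark_app` and `spark_app_state` are hypothetical, and their defaults mirror the name and namespace in the manifest above.

```shell
# Hypothetical helper: submit the SparkApplication manifest with kubectl.
submit_spark_app() {
  kubectl apply -f "${1:-/tmp/spark-example.yaml}"
}

# Hypothetical helper: print the application state reported by the operator.
# Arguments: [application name] [namespace]
spark_app_state() {
  kubectl get sparkapplication "${1:-spark-count-words}" -n "${2:-default}" \
    -o jsonpath='{.status.applicationState.state}'
}

# Example:
#   submit_spark_app
#   spark_app_state
```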

    Check Experimental Results

    After submitting the job, use kubectl to get the Spark driver status:

    $ kubectl get po -l spark-role=driver
    NAME                                 READY   STATUS      RESTARTS   AGE
    spark-alluxio-1592296972094-driver   0/1     Completed   0          4h33m
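Once the driver shows Completed, its log can be printed without copying the generated pod name by hand. This is a minimal sketch, assuming a single driver pod carries the `spark-role=driver` label in the current namespace. The function name `spark_driver_logs` is hypothetical.

```shell
# Hypothetical helper: print the log of the first pod labelled spark-role=driver.
spark_driver_logs() {
  local pod
  pod="$(kubectl get pods -l spark-role=driver -o jsonpath='{.items[0].metadata.name}')" || return 1
  if [ -z "$pod" ]; then
    echo "no driver pod found" >&2
    return 1
  fi
  kubectl logs "$pod"
}

# Example:
#   spark_driver_logs | tail -n 20
```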

    Read the Spark driver log, then check the Alluxio read metrics from the alluxio-master-0 pod:

    $ kubectl exec -ti alluxio-master-0 -n alluxio -- bash
    bash-4.4# alluxio fsadmin report metrics
    Cluster.BytesReadAlluxio (Type: COUNTER, Value: 0B)
    Cluster.BytesReadAlluxioThroughput (Type: GAUGE, Value: 0B/MIN)
    Cluster.BytesReadDomain (Type: COUNTER, Value: 237.66KB)
    Cluster.BytesReadDomainThroughput (Type: GAUGE, Value: 47.53KB/MIN)

    In the metrics above, BytesReadAlluxio and BytesReadAlluxioThroughput measure data read over the network stack, while BytesReadDomain and BytesReadDomainThroughput measure data read through the domain socket. BytesReadAlluxio is 0B and BytesReadDomain is 237.66KB, so all data was transferred through the domain socket.
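This check can be automated. The sketch below reads the metrics report on standard input and succeeds only when no bytes were read over the network and some bytes were read through the domain socket. It assumes the report keeps the `Cluster.<Name> (Type: ..., Value: ...)` line format shown above. The function name `reads_were_short_circuit` is hypothetical.

```shell
# Hypothetical helper: read `alluxio fsadmin report metrics` output on stdin.
# Exit status 0 only if Cluster.BytesReadAlluxio is 0B and
# Cluster.BytesReadDomain is present and not 0B.
reads_were_short_circuit() {
  awk '
    # The trailing space keeps the *Throughput metrics from matching.
    /^ *Cluster\.BytesReadAlluxio / { net = $0 }
    /^ *Cluster\.BytesReadDomain /  { dom = $0 }
    END {
      ok = (net ~ /Value: 0B\)/) && (dom != "") && (dom !~ /Value: 0B\)/)
      exit (ok ? 0 : 1)
    }'
}

# Example:
#   kubectl exec alluxio-master-0 -n alluxio -- alluxio fsadmin report metrics \
#     | reads_were_short_circuit && echo "short-circuit reads only"
```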