Coarse Parallel Processing Using a Work Queue

    In this example, as each pod is created, it picks up one unit of work from a task queue, completes it, deletes it from the queue, and exits.

    Here is an overview of the steps in this example:

    1. Start a message queue service. In this example, we use RabbitMQ, but you could use another one. In practice you would set up a message queue service once and reuse it for many jobs.
    2. Create a queue, and fill it with messages. Each message represents one task to be done. In this example, a message is an integer that we will do a lengthy computation on.
    3. Start a Job that works on tasks from the queue. The Job starts several pods. Each pod takes one task from the message queue, processes it, and exits.
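    These steps can be sketched in plain Python, with queue.Queue standing in for RabbitMQ and threads standing in for pods. This is an illustrative simulation only; the task list and the squaring "computation" are invented for the sketch:

```python
import queue
import threading

# Stand-in for the RabbitMQ queue: each message is one unit of work.
task_queue = queue.Queue()
for n in [3, 5, 7, 9]:
    task_queue.put(n)

results = []
lock = threading.Lock()

def pod():
    """Like one pod in the Job: take one task, process it, exit."""
    n = task_queue.get_nowait()
    with lock:
        results.append(n * n)  # the "lengthy computation"

# completions == number of work items: one pod (thread) per task.
# (A real Job would also cap concurrency with .spec.parallelism.)
pods = [threading.Thread(target=pod) for _ in range(4)]
for p in pods:
    p.start()
for p in pods:
    p.join()

print(sorted(results))
```

    Once every "pod" has taken its one task, the queue is empty and the simulated Job is complete.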

    Be familiar with the basic, non-parallel use of Job.

    You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube, or you can use one of the Kubernetes playgrounds.

    Starting a message queue service

    This example uses RabbitMQ, however, you can adapt the example to use another AMQP-type message service.

    In practice you could set up a message queue service once in a cluster and reuse it for many jobs, as well as for long-running services.

    Start RabbitMQ as follows:

    kubectl create -f https://raw.githubusercontent.com/kubernetes/kubernetes/release-1.3/examples/celery-rabbitmq/rabbitmq-controller.yaml

    replicationcontroller "rabbitmq-controller" created

    We will only use the rabbitmq part from the celery-rabbitmq example.

    Testing the message queue service

    Now, we can experiment with accessing the message queue. We will create a temporary interactive pod, install some tools on it, and experiment with queues.

    First create a temporary interactive Pod.

    # Create a temporary interactive container
    kubectl run -i --tty temp --image ubuntu:18.04

    Waiting for pod default/temp-loe07 to be running, status is Pending, pod ready: false
    ... [ previous line repeats several times .. hit return when it stops ] ...

    Note that your pod name and command prompt will be different.

    Next install the amqp-tools so we can work with message queues.

    # Install some tools
    root@temp-loe07:/# apt-get update
    .... [ lots of output ] ....
    root@temp-loe07:/# apt-get install -y curl ca-certificates amqp-tools python dnsutils
    .... [ lots of output ] ....

    Later, we will make a docker image that includes these packages.

    If Kube-DNS is not set up correctly, the previous step may not work for you. You can also find the service IP in an environment variable:

    # env | grep RABBIT | grep HOST
    RABBITMQ_SERVICE_SERVICE_HOST=10.0.147.152
    # Your address will vary.

    Next we will verify we can create a queue, and publish and consume messages.

    # In the next line, rabbitmq-service is the hostname where the rabbitmq-service
    # can be reached. 5672 is the standard port for rabbitmq.
    root@temp-loe07:/# export BROKER_URL=amqp://guest:guest@rabbitmq-service:5672
    # If you could not resolve "rabbitmq-service" in the previous step,
    # then use this command instead:
    # root@temp-loe07:/# BROKER_URL=amqp://guest:guest@$RABBITMQ_SERVICE_SERVICE_HOST:5672
    # Now create a queue:
    root@temp-loe07:/# /usr/bin/amqp-declare-queue --url=$BROKER_URL -q foo -d
    foo
    root@temp-loe07:/# /usr/bin/amqp-publish --url=$BROKER_URL -r foo -p -b Hello
    # And get it back.
    root@temp-loe07:/# /usr/bin/amqp-consume --url=$BROKER_URL -q foo -c 1 cat && echo
    Hello
    root@temp-loe07:/#

    In the last command, the amqp-consume tool takes one message (-c 1) from the queue, and passes that message to the standard input of an arbitrary command. In this case, cat prints out the characters read from standard input, and the echo adds a newline so the example output is readable.
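    This consume-and-pipe behavior can be mimicked locally with Python's subprocess module. consume_one below is a hypothetical helper written for this sketch, not part of amqp-tools, and the message comes from a plain string rather than a broker:

```python
import subprocess
import sys

def consume_one(message, command):
    """Mimic `amqp-consume -c 1 <command>`: run the command once,
    with the message body on its standard input, and capture stdout."""
    return subprocess.run(
        command, input=message, capture_output=True, text=True
    ).stdout

# `cat` in the tutorial just echoes stdin; a tiny Python one-liner
# does the same thing portably.
echo_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read())"]
out = consume_one("Hello", echo_cmd)
print(out)
```

    The print() in the one-liner plays the role of the trailing echo, adding the newline that makes the output readable.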

    Now let’s fill the queue with some “tasks”. In our example, our tasks are strings to be printed.

    In practice, the content of the messages might be:

    • names of files that need to be processed
    • extra flags to the program
    • ranges of keys in a database table
    • configuration parameters to a simulation
    • frame numbers of a scene to be rendered

    In practice, if there is large data that is needed in a read-only mode by all pods of the Job, you will typically put it in a shared file system like NFS and mount it read-only on all the pods, or the program in the pod will natively read data from a cluster file system like HDFS.

    For our example, we will create the queue and fill it using the amqp command line tools. In practice, you might write a program to fill the queue using an amqp client library.

    /usr/bin/amqp-declare-queue --url=$BROKER_URL -q job1 -d
    job1

    for f in apple banana cherry date fig grape lemon melon
    do
      /usr/bin/amqp-publish --url=$BROKER_URL -r job1 -p -b $f
    done

    So, we filled the queue with 8 messages.
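    A filler program of the kind mentioned above might look like the following sketch. Here publish is a stand-in for a real AMQP client call (for example, pika's channel.basic_publish); it just records messages so the sketch runs without a live broker:

```python
# Sketch of a queue-filler program. Each published message body is one
# task; the queue name "job1" matches the one used in this example.
published = []

def publish(queue_name, body):
    """Stand-in for an AMQP client publish call."""
    published.append((queue_name, body))

tasks = ["apple", "banana", "cherry", "date",
         "fig", "grape", "lemon", "melon"]
for task in tasks:
    publish("job1", task)

print(len(published))  # 8 messages, matching .spec.completions: 8
```

    With a real client library, only the body of publish changes; the loop over tasks stays the same.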

    Create an Image

    Now we are ready to create an image that we will run as a job.

    We will use the amqp-consume utility to read the message from the queue and run our actual program. Here is a very simple example program:

    application/job/rabbitmq/worker.py

    #!/usr/bin/env python
    # Reads a message from standard input, prints it, and sleeps for 10 seconds.
    import sys
    import time
    print("Processing " + sys.stdin.readlines()[0])
    time.sleep(10)
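    When the Job runs, amqp-consume invokes the worker once per message, with the message body on standard input. That invocation can be simulated locally as follows (the 10-second sleep is dropped here so the check finishes quickly):

```python
import os
import subprocess
import sys
import tempfile

# worker.py without the 10-second sleep, for a quick local check.
worker_src = (
    "import sys\n"
    "print(\"Processing \" + sys.stdin.readlines()[0])\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(worker_src)
    path = f.name

# amqp-consume would do the equivalent of this for each message:
out = subprocess.run(
    [sys.executable, path], input="banana\n",
    capture_output=True, text=True,
).stdout
os.unlink(path)
print(out)
```

    The worker never talks to the queue itself; it only reads one line from stdin, which is what makes the program queue-agnostic.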

    Give the script execution permission:

    chmod +x worker.py

    Now, build an image. If you are working in the source tree, then change directory to examples/job/work-queue-1. Otherwise, make a temporary directory, change to it, and download the Dockerfile and worker.py. In either case, build the image with this command:

    docker build -t job-wq-1 .

    For Docker Hub, tag your app image with your username and push to the Hub with the commands below. Replace <username> with your Docker Hub username.

    docker tag job-wq-1 <username>/job-wq-1
    docker push <username>/job-wq-1

    If you are using Google Container Registry, tag your app image with your project ID, and push to GCR. Replace <project> with your project ID.

    docker tag job-wq-1 gcr.io/<project>/job-wq-1
    gcloud docker -- push gcr.io/<project>/job-wq-1

    Defining a Job

    Here is a job definition. You'll need to make a copy of the Job, edit the image to match the name you used, and save it as ./job.yaml.


    apiVersion: batch/v1
    kind: Job
    metadata:
      name: job-wq-1
    spec:
      completions: 8
      parallelism: 2
      template:
        metadata:
          name: job-wq-1
        spec:
          containers:
          - name: c
            image: gcr.io/<project>/job-wq-1
            env:
            - name: BROKER_URL
              value: amqp://guest:guest@rabbitmq-service:5672
            - name: QUEUE
              value: job1
          restartPolicy: OnFailure

    In this example, each pod works on one item from the queue and then exits. So, the completion count of the Job corresponds to the number of work items done. That is why we set .spec.completions: 8 for this example, since we put 8 items in the queue.

    So, now run the Job:

    kubectl apply -f ./job.yaml

    Now wait a bit, then check on the job.

    kubectl describe jobs/job-wq-1

    Name:             job-wq-1
    Namespace:        default
    Selector:         controller-uid=41d75705-92df-11e7-b85e-fa163ee3c11f
    Labels:           controller-uid=41d75705-92df-11e7-b85e-fa163ee3c11f
                      job-name=job-wq-1
    Annotations:      <none>
    Parallelism:      2
    Completions:      8
    Start Time:       Wed, 06 Sep 2017 16:42:02 +0800
    Pods Statuses:    0 Running / 8 Succeeded / 0 Failed
    Pod Template:
      Labels:       controller-uid=41d75705-92df-11e7-b85e-fa163ee3c11f
                    job-name=job-wq-1
      Containers:
       c:
        Image:      gcr.io/causal-jigsaw-637/job-wq-1
        Port:
        Environment:
          BROKER_URL:       amqp://guest:guest@rabbitmq-service:5672
          QUEUE:            job1
        Mounts:             <none>
      Volumes:              <none>
    Events:
      FirstSeen  LastSeen  Count  From    SubobjectPath  Type    Reason            Message
      ─────────  ────────  ─────  ────    ─────────────  ──────  ──────            ───────
      27s        27s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-hcobb
      27s        27s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-weytj
      27s        27s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-qaam5
      27s        27s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-b67sr
      26s        26s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-xe5hj
      15s        15s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-w2zqe
      14s        14s       1      {job }                 Normal  SuccessfulCreate  Created pod: job-wq-1-d6ppa

    All our pods succeeded. Yay.

    Alternatives

    This approach has the advantage that you do not need to modify your “worker” program to be aware that there is a work queue.

    It does require that you run a message queue service. If running a queue service is inconvenient, you may want to consider one of the other job patterns.

    This approach creates a pod for every work item. If your work items only take a few seconds, creating a Pod for every work item may add a lot of overhead. Consider a pattern that executes multiple work items per Pod.

    In this example, we use the amqp-consume utility to read the message from the queue and run our actual program. This has the advantage that you do not need to modify your program to be aware of the queue. A different example shows how to communicate with the work queue using a client library.

    Caveats

    If the number of completions is set to less than the number of items in the queue, then not all items will be processed.

    If the number of completions is set to more than the number of items in the queue, then the Job will not appear to be completed, even though all items in the queue have been processed. It will start additional pods which will block waiting for a message.
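    This blocking behavior can be illustrated with a local simulation, using a timeout in place of a consumer that would otherwise wait forever. The item list and "pod" count here are invented for the sketch:

```python
import queue

q = queue.Queue()
for item in ["a", "b"]:
    q.put(item)

# Three "pods" but only two work items: the third consumer blocks
# waiting for a message, like the extra pods of an over-sized Job.
done, stuck = 0, 0
for _ in range(3):
    try:
        q.get(timeout=0.2)  # amqp-consume would block here forever
        done += 1
    except queue.Empty:
        stuck += 1

print(done, stuck)
```

    In the real Job, the stuck pods never exit, so the Job's completion count never reaches .spec.completions.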