Indexed Job for Parallel Processing with Static Work Assignment

    In this example, you will run a Kubernetes Job that uses multiple parallel worker processes. Each worker is a different container running in its own Pod. The Pods have an index number that the control plane sets automatically, which allows each Pod to identify which part of the overall task to work on.

    The Pod index is available in the annotation batch.kubernetes.io/job-completion-index as a string representing its decimal value. For the containerized task process to obtain this index, you can publish the value of the annotation using the downward API mechanism. For convenience, the control plane automatically sets up the downward API to expose the index in the JOB_COMPLETION_INDEX environment variable.
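    As a minimal sketch of how a containerized process might consume this value, the shell function below reads JOB_COMPLETION_INDEX and checks that it is a decimal string before using it. Only the variable name comes from Kubernetes; the helper name completion_index is hypothetical and introduced purely for illustration.

```shell
# completion_index: print the Pod's completion index, or fail if
# JOB_COMPLETION_INDEX is unset or is not a decimal number.
# (Hypothetical helper; only the variable name comes from Kubernetes.)
completion_index() {
  case "${JOB_COMPLETION_INDEX:-}" in
    ''|*[!0-9]*)
      echo "JOB_COMPLETION_INDEX is missing or not a decimal number" >&2
      return 1
      ;;
    *)
      printf '%s\n' "$JOB_COMPLETION_INDEX"
      ;;
  esac
}

# Outside a cluster the variable is not set, so simulate index 3 here.
JOB_COMPLETION_INDEX=3
index=$(completion_index) && echo "working on item $index"
```

    Validating the value early means a misconfigured Pod fails with a clear message instead of silently processing the wrong work item.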

    Here is an overview of the steps in this example:

    1. Define a Job manifest using indexed completion. The downward API allows you to pass the pod index annotation as an environment variable or file to the container.
    2. Start an Indexed Job based on that manifest.

    You should already be familiar with the basic, non-parallel, use of Job.

    You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. It is recommended to run this tutorial on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one with a local tool such as minikube, or you can use an online Kubernetes playground.

    Your Kubernetes server must be at or later than version v1.21. To check the version, enter kubectl version.

    To access the work item from the worker program, you have a few options:

    1. Read the JOB_COMPLETION_INDEX environment variable. The Job controller automatically links this variable to the annotation containing the completion index.
    2. Read a file that contains the completion index.
    3. Assuming that you can’t modify the program, you can wrap it with a script that reads the index using any of the methods above and converts it into something that the program can use as input.
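    The three options above can be sketched in one small wrapper. This is an illustration under stated assumptions, not the method the tutorial uses: the function names read_index and run_wrapped, the index file path passed as an argument, and the input-N.txt naming scheme are all hypothetical.

```shell
# read_index FILE: print the completion index, preferring the
# JOB_COMPLETION_INDEX environment variable (option 1) and falling back to
# FILE (option 2). FILE is a hypothetical path chosen by the manifest author.
read_index() {
  if [ -n "${JOB_COMPLETION_INDEX:-}" ]; then
    printf '%s\n' "$JOB_COMPLETION_INDEX"
  elif [ -r "$1" ]; then
    # The file holds the index as a decimal string; strip any whitespace.
    tr -d '[:space:]' < "$1"
    echo
  else
    echo "no completion index available" >&2
    return 1
  fi
}

# run_wrapped FILE: option 3, a wrapper around an unmodified program.
# Here the wrapper only converts the index into a hypothetical input file
# name; a real wrapper would then start the program with that name.
run_wrapped() {
  index=$(read_index "$1") || return 1
  echo "input-$index.txt"
}

# Simulate a Pod where only the index file is available.
index_file=$(mktemp)
echo 2 > "$index_file"
unset JOB_COMPLETION_INDEX
run_wrapped "$index_file"
rm -f "$index_file"
```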

    For this example, imagine that you chose option 3 and you want to run the rev utility. This program accepts a file as an argument and prints its content reversed.

    You’ll use the rev tool from the busybox container image.

    As this is only an example, each Pod only does a tiny piece of work (reversing a short string). In a real workload you might, for example, create a Job that represents the task of producing 60 seconds of video based on scene data. Each work item in the video rendering Job would be to render a particular frame of that video clip. Indexed completion would mean that each Pod in the Job knows which frame to render and publish, by counting frames from the start of the clip.
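    To make the video example concrete, the arithmetic below shows how a completion index could map to a frame. The frame rate of 24 frames per second is an assumption made only for this sketch, as is the helper name frame_start_ms.

```shell
# Assumed values for illustration only; the frame rate is not from the text.
clip_seconds=60
frames_per_second=24

# One completion per frame, so .spec.completions would equal total_frames.
total_frames=$((clip_seconds * frames_per_second))

# frame_start_ms INDEX: time offset, in milliseconds, of the frame that the
# Pod with this completion index would render, counted from the clip start.
frame_start_ms() {
  echo $(( $1 * 1000 / frames_per_second ))
}

echo "completions: $total_frames"
echo "index 36 starts at $(frame_start_ms 36) ms"
```

    Because the mapping depends only on the index, no Pod needs to coordinate with any other Pod to find its frame.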

    Here is a sample Job manifest that uses Indexed completion mode:

        apiVersion: batch/v1
        kind: Job
        metadata:
          name: 'indexed-job'
        spec:
          completions: 5
          parallelism: 3
          completionMode: Indexed
          template:
            spec:
              restartPolicy: Never
              # The init container maps the completion index to a work item
              # and writes it to the shared volume.
              initContainers:
              - name: 'input'
                image: 'docker.io/library/bash'
                command:
                - "bash"
                - "-c"
                - |
                  items=(foo bar baz qux xyz)
                  echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt
                volumeMounts:
                - mountPath: /input
                  name: input
              # The worker reverses the content of the file written above.
              containers:
              - name: 'worker'
                image: 'docker.io/library/busybox'
                command:
                - "rev"
                - "/input/data.txt"
                volumeMounts:
                - mountPath: /input
                  name: input
              volumes:
              - name: input
                emptyDir: {}
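    To see what each Pod in this manifest computes without a cluster, the sketch below mimics the two containers on a workstation. It assumes bash, mktemp, and awk are installed. The function name simulate_pod and the INPUT_DIR variable are hypothetical; a temporary directory stands in for the volume mounted at /input.

```shell
# simulate_pod INDEX: mimic one Pod of the Job locally.
# A temporary directory stands in for the volume mounted at /input.
simulate_pod() {
  workdir=$(mktemp -d)
  # Init container step: the same script as in the manifest, run under bash,
  # except that it writes to $INPUT_DIR (hypothetical) instead of /input.
  JOB_COMPLETION_INDEX=$1 INPUT_DIR=$workdir bash -c '
    items=(foo bar baz qux xyz)
    echo ${items[$JOB_COMPLETION_INDEX]} > "$INPUT_DIR/data.txt"
  '
  # Worker step: reverse the file, as "rev /input/data.txt" does in the Pod.
  if command -v rev >/dev/null 2>&1; then
    rev "$workdir/data.txt"
  else
    # Fallback for systems without rev: reverse each line with awk.
    awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }' "$workdir/data.txt"
  fi
  rm -rf "$workdir"
}

# The Job has completions: 5, so the indexes run from 0 to 4.
for index in 0 1 2 3 4; do
  echo "index $index: $(simulate_pod "$index")"
done
```

    Index 3, for example, selects qux and the worker step prints xuq.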

    In the example above, you use the built-in JOB_COMPLETION_INDEX environment variable that the Job controller sets for all containers. An init container maps the index to a static value and writes it to a file that is shared with the container running the worker through an emptyDir volume. Optionally, you can define your own environment variable through the downward API to publish the index to containers. You can also choose to load a list of values from a ConfigMap as an environment variable or file.
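    As a sketch of the optional approach, the snippet below writes a container env fragment that publishes the index annotation under a custom variable name through the downward API. The variable name MY_JOB_INDEX and the temporary output file are hypothetical; the fragment would still need to be merged by hand into a container entry of the Job manifest.

```shell
# Write a container "env" fragment that exposes the completion index
# annotation under a custom name. MY_JOB_INDEX is a hypothetical name.
fragment_file=$(mktemp)
cat > "$fragment_file" <<'EOF'
env:
- name: MY_JOB_INDEX
  valueFrom:
    fieldRef:
      fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
EOF

# Show the fragment so that it can be copied into a container spec.
cat "$fragment_file"
```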

    Alternatively, you can use the downward API to pass the index to the container as a file in a volume; see the example manifest application/job/indexed-job-vol.yaml.

    Now run the Job:

        # This uses the first approach (relying on $JOB_COMPLETION_INDEX)
        kubectl apply -f https://kubernetes.io/examples/application/job/indexed-job.yaml

    When you create this Job, the control plane creates a series of Pods, one for each index you specified. The value of .spec.parallelism determines how many Pods can run at once, whereas .spec.completions determines how many Pods the Job creates in total.

    Because .spec.parallelism is less than .spec.completions, the control plane waits for some of the first Pods to complete before starting more of them.
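    The relationship between the two fields can be checked with simple arithmetic. The sketch below uses the values from the sample manifest; the variable names initial and waiting are introduced only for illustration.

```shell
# Values taken from the sample manifest.
completions=5
parallelism=3

# At most "parallelism" Pods run at once, and never more than "completions".
if [ "$parallelism" -lt "$completions" ]; then
  initial=$parallelism
else
  initial=$completions
fi

# The remaining indexes start only after earlier Pods complete.
waiting=$((completions - initial))

echo "$initial Pods can start immediately; $waiting wait for a free slot"
```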

    Once you have created the Job, wait a moment then check on progress:

        kubectl describe jobs/indexed-job

    The output is similar to:

        Name:              indexed-job
        Namespace:         default
        Selector:          controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
        Labels:            controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
                           job-name=indexed-job
        Annotations:       <none>
        Parallelism:       3
        Completions:       5
        Start Time:        Thu, 11 Mar 2021 15:47:34 +0000
        Pods Statuses:     2 Running / 3 Succeeded / 0 Failed
        Completed Indexes: 0-2
        Pod Template:
          Labels:  controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
                   job-name=indexed-job
          Init Containers:
           input:
            Image:      docker.io/library/bash
            Port:       <none>
            Command:
              bash
              -c
              items=(foo bar baz qux xyz)
              echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt
            Environment:  <none>
            Mounts:
              /input from input (rw)
          Containers:
           worker:
            Image:      docker.io/library/busybox
            Port:       <none>
            Host Port:  <none>
            Command:
              rev
              /input/data.txt
            Environment:  <none>
            Mounts:
              /input from input (rw)
          Volumes:
           input:
            Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
            Medium:
            SizeLimit:  <unset>
        Events:
          Type    Reason            Age   From            Message
          ----    ------            ----  ----            -------
          Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-njkjj
          Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-9kd4h
          Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-qjwsz
          Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-fdhq5
          Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-ncslj
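    The Completed Indexes field in this output uses a compact notation: indexes are separated by commas, and runs of consecutive indexes are written as a first-last range such as 0-2. As a minimal sketch, the hypothetical helper expand_indexes below turns such a string into one index per line, which can be handy when comparing against a list of expected work items.

```shell
# expand_indexes SPEC: print every index named by a string such as
# "0-2" or "1,3-5,7", one per line. (Hypothetical helper for illustration.)
expand_indexes() {
  # Split the argument on commas into the positional parameters.
  old_ifs=$IFS
  IFS=,
  set -- $1
  IFS=$old_ifs
  for part in "$@"; do
    case "$part" in
      *-*) start=${part%-*}; end=${part#*-} ;;   # a range "first-last"
      *)   start=$part; end=$part ;;             # a single index
    esac
    while [ "$start" -le "$end" ]; do
      echo "$start"
      start=$((start + 1))
    done
  done
}

# The value shown in the sample output above.
expand_indexes "0-2"
```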

    The output is similar to: