Usage - Distributed Coach

    Coach uses three interfaces to orchestrate, schedule, and manage the resources of the workers it spawns in distributed mode. These interfaces are the orchestrator, the memory backend, and the data store. Refer to Distributed Coach - Horizontal Scale-Out for more information. The example in this section uses Kubernetes as the orchestrator, Redis Pub/Sub as the memory backend, and S3 as the data store.

    Prerequisites

    • Building and pushing containers - Docker.

    • Container registry access for hosting container images.

    • Using S3 for storing policy checkpoints - AWS CLI and AWS credentials.
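
    The prerequisites above can be checked before starting. The following is a minimal sketch, assuming that Docker and the AWS CLI install executables named docker and aws on your PATH; the helper name missing_tools is hypothetical and not part of Coach.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Assumption: "docker" and "aws" are the executable names on this machine.
    missing = missing_tools(["docker", "aws"])
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("Docker and the AWS CLI were found on PATH.")
```

    This only confirms that the tools are installed. It does not verify registry access or that the AWS credentials are valid.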

    Build Container Image and Push

    Create a directory named docker.

    $ mkdir docker

    Create docker files in the docker directory.

    A sample base docker file (Dockerfile.base) would look like this:

    FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

    ################################
    # Install apt-get Requirements #
    ################################

    # General
    RUN apt-get update && \
        apt-get install -y python3-pip cmake zlib1g-dev python3-tk python-opencv \
        # Boost libraries
        libboost-all-dev \
        # Scipy requirements
        libblas-dev liblapack-dev libatlas-base-dev gfortran \
        # Pygame requirements
        libsdl-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-ttf2.0-dev \
        libsmpeg-dev libportmidi-dev libavformat-dev libswscale-dev \
        # Dashboard
        dpkg-dev build-essential python3.5-dev libjpeg-dev libtiff-dev libsdl1.2-dev libnotify-dev \
        freeglut3 freeglut3-dev libsm-dev libgtk2.0-dev libgtk-3-dev libwebkitgtk-dev libgtk-3-dev \
        libwebkitgtk-3.0-dev libgstreamer-plugins-base1.0-dev \
        # Gym
        # Mujoco_py
        curl libgl1-mesa-dev libgl1-mesa-glx libglew-dev libosmesa6-dev software-properties-common \
        # ViZDoom
        build-essential zlib1g-dev libsdl2-dev libjpeg-dev \
        nasm tar libbz2-dev libgtk2.0-dev cmake git libfluidsynth-dev libgme-dev \
        libopenal-dev timidity libwildmidi-dev unzip wget && \
        apt-get clean autoclean && \
        apt-get autoremove -y

    ############################
    # Install Pip Requirements #
    ############################
    RUN pip3 install --upgrade pip
    RUN pip3 install setuptools==39.1.0 && pip3 install pytest && pip3 install pytest-xdist

    RUN curl -o /usr/local/bin/patchelf https://s3-us-west-2.amazonaws.com/openai-sci-artifacts/manual-builds/patchelf_0.9_amd64.elf \
        && chmod +x /usr/local/bin/patchelf

    A sample docker file for the gym environment would look like this:

    FROM coach-base:master as builder

    # prep gym and any of its related requirements.
    RUN pip3 install gym[atari,box2d,classic_control]==0.10.5

    # add coach source starting with files that could trigger
    # re-build if dependencies change.
    RUN mkdir /root/src
    COPY setup.py /root/src/.
    COPY requirements.txt /root/src/.
    RUN pip3 install -r /root/src/requirements.txt

    FROM coach-base:master
    WORKDIR /root/src
    COPY --from=builder /root/.cache /root/.cache
    COPY setup.py /root/src/.
    COPY requirements.txt /root/src/.
    COPY README.md /root/src/.
    COPY . /root/src

    A sample docker file for the Doom (ViZDoom) environment would look like this:

    FROM coach-base:master as builder

    # prep vizdoom and any of its related requirements.
    RUN pip3 install vizdoom

    # add coach source starting with files that could trigger
    # re-build if dependencies change.
    RUN mkdir /root/src
    COPY setup.py /root/src/.
    COPY requirements.txt /root/src/.
    RUN pip3 install -r /root/src/requirements.txt

    FROM coach-base:master
    WORKDIR /root/src
    COPY --from=builder /root/.cache /root/.cache
    COPY setup.py /root/src/.
    COPY requirements.txt /root/src/.
    COPY README.md /root/src/.
    RUN pip3 install vizdoom && pip3 install -e .[all] && rm -rf /root/.cache
    COPY . /root/src

    Build the base container. Make sure you are in the Coach root directory before building.

    $ docker build -t coach-base:master -f docker/Dockerfile.base .

    If you would like to use the Mujoco environment, save your Mujoco key as an environment variable. Replace <mujoco_key> with the contents of your Mujoco key.

    $ export MUJOCO_KEY=<mujoco_key>

    Build the container for your environment. Replace <env> with your choice of environment. The choices are gym, mujoco and doom. Replace <user-name>, <image-name> and <tag> with appropriate values.
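
    The build step above can be sketched as a small helper that composes the docker build invocation from the placeholders. This is an illustration under stated assumptions, not the project's documented command: it assumes the environment docker files are named docker/Dockerfile.<env>, following the Dockerfile.base naming used earlier, and that the Mujoco docker file declares a MUJOCO_KEY build argument. The function name and the example values (alice, coach-gym, v1) are hypothetical.

```python
import shlex

ENVIRONMENTS = ("gym", "mujoco", "doom")


def docker_build_command(user_name, image_name, tag, env):
    """Compose a `docker build` invocation for one environment image."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"env must be one of {ENVIRONMENTS}, got {env!r}")
    command = ["docker", "build"]
    if env == "mujoco":
        # Assumption: the Mujoco docker file declares `ARG MUJOCO_KEY`.
        # `--build-arg NAME` with no value forwards NAME from the calling shell.
        command += ["--build-arg", "MUJOCO_KEY"]
    command += [
        "-t", f"{user_name}/{image_name}:{tag}",
        # Assumption: docker files are named docker/Dockerfile.<env>.
        "-f", f"docker/Dockerfile.{env}",
        ".",
    ]
    return command


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(shlex.join(docker_build_command("alice", "coach-gym", "v1", "gym")))
```

    Returning the command as a list keeps it easy to inspect before running it, for example by passing it to subprocess.run from the Coach root directory.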

    Push the container to a registry of your choice. Replace <user-name>, <image-name> and <tag> with appropriate values.

    $ docker push <user-name>/<image-name>:<tag>

    Add the following contents to a file. Replace <user-name>, <image-name>, <tag>, <bucket-name> and <path-to-aws-credentials> with appropriate values.

    [coach]
    image = <user-name>/<image-name>:<tag>
    memory_backend = redispubsub
    data_store = s3
    s3_end_point = s3.amazonaws.com
    s3_bucket_name = <bucket-name>
    s3_creds_file = <path-to-aws-credentials>
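
    If you prefer to generate this file rather than write it by hand, the [coach] section above can be rendered with Python's standard configparser module. This is a minimal sketch that assumes the file uses ordinary INI syntax; the function name render_coach_config and the example values are hypothetical.

```python
import configparser
import io


def render_coach_config(image, bucket_name, creds_file):
    """Render the [coach] section shown above as INI text."""
    # interpolation=None keeps any '%' in a value literal.
    config = configparser.ConfigParser(interpolation=None)
    config["coach"] = {
        "image": image,
        "memory_backend": "redispubsub",
        "data_store": "s3",
        "s3_end_point": "s3.amazonaws.com",
        "s3_bucket_name": bucket_name,
        "s3_creds_file": creds_file,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(render_coach_config(
        "alice/coach-gym:v1", "my-coach-bucket", "/home/alice/.aws/credentials"
    ))
```

    The returned text can be written to whichever file you pass to Coach as the distributed configuration.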

    Run Distributed Coach

    The following command runs distributed Coach with the CartPole_ClippedPPO preset, Redis Pub/Sub as the memory backend, and S3 as the data store, in Kubernetes with three rollout workers.

    $ python3 rl_coach/coach.py -p CartPole_ClippedPPO \
        -dc \
        -e <experiment-name> \
        -n 3 \