Usage - Distributed Coach

Coach uses three interfaces to orchestrate, schedule and manage the resources of the workers it spawns in distributed mode. These interfaces are the orchestrator, the memory backend and the data store. Refer to Distributed Coach - Horizontal Scale-Out for more information. The following implementations are available for each interface:

Orchestrator - Kubernetes.
Memory Backend - Redis Pub/Sub.
Data Store - S3 and NFS.

Prerequisites

Building and pushing containers - Docker.
Container registry access for hosting container images -
Using S3 for storing policy checkpoints - AWS CLI, AWS credentials and an S3 bucket.
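
As a quick sanity check before continuing, the sketch below reports which command-line prerequisites cannot be found on the PATH. It assumes the tools are invoked as `docker` and `aws`; the function name `missing_tools` is illustrative and not part of Coach. It does not verify registry access or that AWS credentials are valid.

```python
import shutil


def missing_tools(tools=("docker", "aws")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the executables exist; pushing images and writing to S3 still depend on your own credentials.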

Build Container Image and Push

Create a directory named docker.

$ mkdir docker

Create docker files in the docker directory.

A sample base docker file (Dockerfile.base) would look like this:

FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04
 
################################
# Install apt-get Requirements #
################################
 
# General
RUN apt-get update && \
    apt-get install -y python3-pip cmake zlib1g-dev python3-tk python-opencv \
    # Boost libraries
    libboost-all-dev \
    # Scipy requirements
    libblas-dev liblapack-dev libatlas-base-dev gfortran \
    # Pygame requirements
    libsdl-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-ttf2.0-dev \
    libsmpeg-dev libportmidi-dev libavformat-dev libswscale-dev \
    # Dashboard
    dpkg-dev build-essential python3.5-dev libjpeg-dev  libtiff-dev libsdl1.2-dev libnotify-dev \
    freeglut3 freeglut3-dev libsm-dev libgtk2.0-dev libgtk-3-dev libwebkitgtk-dev libgtk-3-dev \
    libwebkitgtk-3.0-dev libgstreamer-plugins-base1.0-dev \
    # Gym
    # Mujoco_py
    curl libgl1-mesa-dev libgl1-mesa-glx libglew-dev libosmesa6-dev software-properties-common \
    # ViZDoom
    build-essential zlib1g-dev libsdl2-dev libjpeg-dev \
    nasm tar libbz2-dev libgtk2.0-dev cmake git libfluidsynth-dev libgme-dev \
    libopenal-dev timidity libwildmidi-dev unzip wget && \
    apt-get clean autoclean && \
    apt-get autoremove -y
 
############################
# Install Pip Requirements #
############################
RUN pip3 install --upgrade pip
RUN pip3 install setuptools==39.1.0 && pip3 install pytest && pip3 install pytest-xdist
 
RUN curl -o /usr/local/bin/patchelf https://s3-us-west-2.amazonaws.com/openai-sci-artifacts/manual-builds/patchelf_0.9_amd64.elf \
    && chmod +x /usr/local/bin/patchelf

A sample docker file for the gym environment would look like this:

FROM coach-base:master as builder
 
# prep gym and any of its related requirements.
RUN pip3 install gym[atari,box2d,classic_control]==0.10.5
 
# add coach source starting with files that could trigger
# re-build if dependencies change.
RUN mkdir /root/src
COPY setup.py /root/src/.
COPY requirements.txt /root/src/.
RUN pip3 install -r /root/src/requirements.txt
 
FROM coach-base:master
WORKDIR /root/src
COPY --from=builder /root/.cache /root/.cache
COPY setup.py /root/src/.
COPY requirements.txt /root/src/.
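COPY README.md /root/src/.
# install gym and Coach in the final image, reusing the pip cache copied from the builder stage
RUN pip3 install gym[atari,box2d,classic_control]==0.10.5 && pip3 install -e .[all] && rm -rf /root/.cache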
COPY . /root/src

A sample docker file for the ViZDoom (doom) environment would look like this:

FROM coach-base:master as builder
 
# prep vizdoom and any of its related requirements.
RUN pip3 install vizdoom
 
# add coach source starting with files that could trigger
# re-build if dependencies change.
RUN mkdir /root/src
COPY setup.py /root/src/.
COPY requirements.txt /root/src/.
RUN pip3 install -r /root/src/requirements.txt
 
FROM coach-base:master
WORKDIR /root/src
COPY --from=builder /root/.cache /root/.cache
COPY setup.py /root/src/.
COPY requirements.txt /root/src/.
COPY README.md /root/src/.
RUN pip3 install vizdoom && pip3 install -e .[all] && rm -rf /root/.cache
COPY . /root/src

Build the base container. Make sure you are in the Coach root directory before building.

$ docker build -t coach-base:master -f docker/Dockerfile.base .

If you would like to use the Mujoco environment, save your key as an environment variable. Replace <mujoco_key> with the contents of your Mujoco key.

$ export MUJOCO_KEY=<mujoco_key>

Build the container for your environment. Replace <env> with your choice of environment: gym, mujoco or doom. Replace <user-name>, <image-name> and <tag> with appropriate values.
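
The build step can be sketched in Python as a helper that assembles the `docker build` arguments. This is a minimal sketch under stated assumptions: the per-environment file is assumed to be named `docker/Dockerfile.<env>`, mirroring `docker/Dockerfile.base` above, and passing the Mujoco key through `--build-arg MUJOCO_KEY=...` is an assumption about how the key reaches the build. The function name `build_command` is hypothetical.

```python
import shlex

# Environment choices named in the text above.
ENVIRONMENTS = ("gym", "mujoco", "doom")


def build_command(env, user_name, image_name, tag, mujoco_key=None):
    """Assemble a `docker build` argument list for one environment image."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"env must be one of {ENVIRONMENTS}, got {env!r}")
    cmd = ["docker", "build"]
    if mujoco_key is not None:
        # Assumption: the Dockerfile reads the key from a build argument.
        cmd += ["--build-arg", f"MUJOCO_KEY={mujoco_key}"]
    cmd += [
        "-t", f"{user_name}/{image_name}:{tag}",
        # Assumption: file naming follows docker/Dockerfile.base.
        "-f", f"docker/Dockerfile.{env}",
        ".",
    ]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command; run it from the Coach root directory.
    print(" ".join(shlex.quote(part) for part in build_command("gym", "me", "coach", "v1")))
```

Returning a list rather than a string means the result can be passed straight to `subprocess.run` without shell quoting concerns.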

Push the container to a registry of your choice. Replace <user-name>, <image-name> and <tag> with appropriate values.

$ docker push <user-name>/<image-name>:<tag>

Add the following contents to a file. Replace <user-name>, <image-name>, <tag>, <bucket-name> and <path-to-aws-credentials> with appropriate values.

[coach]
image = <user-name>/<image-name>:<tag>
memory_backend = redispubsub
data_store = s3
s3_end_point = s3.amazonaws.com
s3_bucket_name = <bucket-name>
s3_creds_file = <path-to-aws-credentials>
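
If you prefer to generate this file from a script, the sketch below writes the same `[coach]` section with Python's standard `configparser` module. The function name `write_coach_config` and its `path` argument are hypothetical; the keys and fixed values are copied from the listing above, and the remaining values are whatever you pass in.

```python
import configparser


def write_coach_config(path, image, bucket_name, creds_file):
    """Write a [coach] section with the keys shown in the listing above."""
    config = configparser.ConfigParser()
    config["coach"] = {
        "image": image,
        "memory_backend": "redispubsub",
        "data_store": "s3",
        "s3_end_point": "s3.amazonaws.com",
        "s3_bucket_name": bucket_name,
        "s3_creds_file": creds_file,
    }
    with open(path, "w") as handle:
        config.write(handle)


def read_coach_config(path):
    """Read the file back and return the [coach] section as a plain dict."""
    config = configparser.ConfigParser()
    config.read(path)
    return dict(config["coach"])
```

Reading the file back with `read_coach_config` is a cheap way to confirm that no placeholder such as `<bucket-name>` was left unreplaced.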

Run Distributed Coach

The following command runs distributed Coach with the CartPole_ClippedPPO preset, Redis Pub/Sub as the memory backend and S3 as the data store, in Kubernetes with three rollout workers.

$ python3 rl_coach/coach.py -p CartPole_ClippedPPO \
-dc \
-e <experiment-name> \
-n 3 \