Experiments

    All state must be cleared between runs so that nothing carries over from one experiment to the next.

    To run throughput/latency experiments you'll need to set up the client machine with (on the machine itself):

    Billing Estimates

    To get resource measurements from the hosts running experiments, we first need an inventory file at ansible/inventory/billing.yml, something like:

        [all]
        myhost1
        myhost2
        ...

    Then run the billing set-up playbook against it:

        cd ansible
        ansible-playbook -i inventory/billing.yml billing_setup.yml
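
The inventory above can also be generated from a list of hosts rather than written by hand. This is a minimal sketch, assuming the same single `[all]` group layout; `write_billing_inventory` is a hypothetical helper (not part of the repository) and the host names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): writes an inventory
# file with a single [all] group and one host per line.
# Usage: write_billing_inventory <output-file> <host> [<host>...]
write_billing_inventory() {
    if [ "$#" -lt 2 ]; then
        echo "usage: write_billing_inventory <output-file> <host> [<host>...]" >&2
        return 1
    fi
    out="$1"
    shift
    {
        echo "[all]"
        for host in "$@"; do
            echo "$host"
        done
    } > "$out"
}

# Example (placeholder host names):
# write_billing_inventory ansible/inventory/billing.yml myhost1 myhost2
```

The function overwrites the output file on each call, so the inventory always reflects exactly the hosts passed in.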

    Data should be generated and uploaded ahead of time.

    For details of the SGD experiment data see notes.

    The matrix experiment data needs to be generated in bulk locally, uploaded to S3, then downloaded on the client machine (or copied directly with scp). You must have the native tooling and pyfaasm installed to generate it up front (but this doesn't need to be done if it's already in S3):

        inv data.tf-upload data.tf-state

    SGD Experiment

        # -- Prepare --
        # Upload data (one off)
        inv data.reuters-state

        # -- Build/upload --
        inv knative.build-native sgd reuters_svm
        inv upload sgd reuters_svm

        # -- Deploy --

        export N_WORKERS=10

        # Native containers
        inv knative.deploy-native sgd reuters_svm $N_WORKERS

        # Wasm
        inv knative.deploy $N_WORKERS

        # -- Wait --

        watch kn -n faasm service list
        watch kubectl -n faasm get pods

        # -- Run experiment --

        # Native SGD
        inv experiments.sgd --native $N_WORKERS 60000

        # Wasm SGD
        inv experiments.sgd $N_WORKERS 60000

        # -- Clean up --

        # Native SGD
        inv knative.delete-native sgd reuters_svm

        # Wasm
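
The "Wait" step above relies on watching the output by hand. When scripting a run, a polling helper can block until a readiness check passes. This is a minimal sketch: `wait_until` is a hypothetical helper, and using `kubectl wait --for=condition=Ready` as the readiness check is an assumption, since the listing above only uses `watch`.

```shell
#!/bin/sh
# Hypothetical helper: re-runs a command every <interval> seconds until it
# succeeds, giving up once <timeout> seconds of sleeping have accumulated.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
wait_until() {
    timeout="$1"
    interval="$2"
    shift 2
    waited=0
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "wait_until: gave up after ${waited}s: $*" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Example (assumes kubectl is configured for the cluster; --timeout=0 makes
# kubectl check once and return, so wait_until controls the retries):
# wait_until 300 5 kubectl -n faasm wait --for=condition=Ready pods --all --timeout=0
```

Only the time spent sleeping counts towards the timeout, so a slow check command can make the total wait longer than the stated limit.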

    Tensorflow Experiment

    You need to set the following environment variables for these experiments (through the knative config):

    • COLD_START_DELAY_MS=800
    • SGD_CODEGEN=off
    • PYTHON_CODEGEN=off
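
Because these values are set through the knative config, a run can easily pick up stale settings. This is a minimal sketch of a sanity check, assuming it is run in a shell that sees the same environment as the workers (for example inside a worker container); `check_tf_env` is a hypothetical helper, not a task from the repository.

```shell
#!/bin/sh
# Hypothetical helper: checks that the variables listed above have the
# expected values in the current environment. Returns non-zero and reports
# every mismatch on stderr if any value differs or is unset.
check_tf_env() {
    status=0
    for pair in COLD_START_DELAY_MS=800 SGD_CODEGEN=off PYTHON_CODEGEN=off; do
        name="${pair%%=*}"      # text before the first '='
        expected="${pair#*=}"   # text after the first '='
        # Indirect variable lookup in POSIX sh; an unset variable reads as empty
        eval "actual=\${$name:-}"
        if [ "$actual" != "$expected" ]; then
            echo "$name is '$actual', expected '$expected'" >&2
            status=1
        fi
    done
    return "$status"
}

# Example:
# check_tf_env && inv experiments.tf-lat
```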

    Preamble:

        # -- Build/upload --
        inv knative.build-native tf image
        inv upload tf image

        # -- Upload data (one-off) --
        inv data.tf-upload data.tf-state

    Latency:

        # -- Deploy both (note small number of workers) --
        inv knative.deploy-native tf image 1
        inv knative.deploy 1

        # -- Run experiment --
        inv experiments.tf-lat

    Once you've done several runs, you need to pull the results to your local machine and process:

        # SGD
        inv experiments.sgd-pull-results <user> <host>

        # Matrices
        inv experiments.matrix-pull-results <user> <host>

        # Inference latency
        inv experiments.tf-lat-pull-results <user> <host>

        # Inference throughput