Getting Started with PySpark

    Before adding MLeap PySpark to your project, you first have to compile and add MLeap Spark.

    MLeap PySpark is available in the GitHub repository in the python package.

    Then, in your Python environment, run:

    import mleap.pyspark

    Note: the import of mleap.pyspark needs to happen before any other PySpark libraries are imported.
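
    The ordering requirement in the note above can be sketched as follows. This is a minimal illustration, not part of MLeap: the helper pyspark_loaded is a hypothetical guard that only inspects sys.modules, and the try/except is there so the snippet still runs where MLeap or PySpark is not installed.

```python
import sys


def pyspark_loaded(modules=None):
    """Return True if pyspark or any of its submodules is already imported.

    Hypothetical helper for illustration; it is not provided by MLeap.
    """
    names = sys.modules if modules is None else modules
    return any(n == "pyspark" or n.startswith("pyspark.") for n in list(names))


# Warn if the ordering requirement has already been violated in this process.
if pyspark_loaded():
    print("warning: pyspark was imported before mleap.pyspark")

try:
    import mleap.pyspark  # MLeap first
    from pyspark.ml import Pipeline  # PySpark imports follow
except ImportError:
    # MLeap or PySpark is not installed in this environment.
    Pipeline = None
```

    In a notebook, this usually means placing the MLeap import in the very first cell, before any cell that touches pyspark.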

    Using PIP

    Alternatively, there is PIP support for PySpark available under: .

    To use MLeap extensions to PySpark:

    1. See build instructions to build MLeap from source.
    2. See for an overview of ML pipelines.
    3. See the demo notebook for how to use PySpark and MLeap to serialize your pipeline to Bundle.ml.
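
    As a rough sketch of the serialization step mentioned in the last item, the helpers below wrap the serializeToBundle method that MLeap is understood to attach to Spark transformers once mleap.pyspark.spark_support has been imported. The names bundle_uri and save_bundle, and the example path, are assumptions made for illustration; a POSIX-style absolute path is also assumed.

```python
import os


def bundle_uri(zip_path):
    """Build a jar:file: URI for a bundle zip file (absolute paths only)."""
    if not os.path.isabs(zip_path):
        raise ValueError("bundle path must be absolute: %r" % zip_path)
    return "jar:file:" + zip_path


def save_bundle(fitted_pipeline, dataset, zip_path):
    """Serialize a fitted pipeline to a bundle zip and return the URI used.

    Assumes fitted_pipeline exposes serializeToBundle(uri, dataset), which
    is expected to be available after importing mleap.pyspark.spark_support.
    """
    uri = bundle_uri(zip_path)
    # The transformed dataset is passed alongside the target URI.
    fitted_pipeline.serializeToBundle(uri, fitted_pipeline.transform(dataset))
    return uri
```

    With MLeap and PySpark installed, a call might look like save_bundle(model, df, "/tmp/pyspark.example.zip"), where model is the result of Pipeline.fit(df) and both import mleap.pyspark and from mleap.pyspark.spark_support import SimpleSparkSerializer have run before any PySpark import.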