Execution Mode

    Prior to release-1.15, there was only one execution mode, called PROCESS execution mode. In PROCESS mode, Python user-defined functions are executed in separate Python processes.

    Release-1.15 introduced a new execution mode called THREAD execution mode. In THREAD mode, Python user-defined functions are executed in the same process as the Java operator. Note that multiple Python user-defined functions running in the same JVM are still affected by the GIL.

    THREAD mode was introduced to avoid the serialization/deserialization and network communication overhead incurred in PROCESS mode. If performance is not a concern, or if the computing logic of your Python functions is itself the performance bottleneck of the job, PROCESS mode is the best choice because it provides better isolation than THREAD mode.
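    The cost described above can be illustrated with a toy sketch. It is not how PyFlink is implemented: pickle stands in for whatever serializers and transport PROCESS mode actually uses, and the names add_one, call_in_process, and call_via_serialization are hypothetical. It only shows that crossing a process boundary adds an encode/decode step on both the input and the result, which a same-process call skips.

```python
import pickle


def add_one(x: int) -> int:
    """Stand-in for a Python user-defined function."""
    return x + 1


def call_in_process(values):
    """THREAD-like path: the function is called directly on each value."""
    return [add_one(v) for v in values]


def call_via_serialization(values):
    """PROCESS-like path: inputs and results are serialized on each side.

    Assumption: pickle is used purely for illustration.
    """
    results = []
    for v in values:
        payload = pickle.dumps(v)            # caller encodes the input
        arg = pickle.loads(payload)          # worker decodes the input
        reply = pickle.dumps(add_one(arg))   # worker encodes the result
        results.append(pickle.loads(reply))  # caller decodes the result
    return results
```

    Both functions return the same values; the second simply does more work per record, which is the overhead THREAD mode is meant to remove.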

    • PROCESS: The Python user-defined functions will be executed in separate Python processes.

    • THREAD: The Python user-defined functions will be executed in the same process as the Java operator.

    You can specify the Python execution mode through the Python Table API by setting the python.execution-mode configuration option.
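    A minimal sketch of setting the mode, assuming PyFlink 1.15 or later is installed. The option key python.execution-mode and its values process and thread come from PyFlink; the helper names normalize_execution_mode, apply_execution_mode, and build_table_env are hypothetical and exist only for this illustration.

```python
VALID_MODES = ("process", "thread")


def normalize_execution_mode(mode: str) -> str:
    """Hypothetical helper: lowercase the mode and reject unknown values."""
    normalized = mode.strip().lower()
    if normalized not in VALID_MODES:
        raise ValueError(f"unsupported execution mode: {mode!r}")
    return normalized


def apply_execution_mode(table_env, mode: str) -> None:
    """Set python.execution-mode on an existing TableEnvironment."""
    table_env.get_config().set(
        "python.execution-mode", normalize_execution_mode(mode)
    )


def build_table_env(mode: str = "process"):
    """Create a streaming TableEnvironment with the given mode (needs PyFlink)."""
    # Imported lazily so the helpers above can be used without PyFlink installed.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    table_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    apply_execution_mode(table_env, mode)
    return table_env
```

    Calling build_table_env("thread") would return a table environment configured for THREAD mode; passing "process" selects PROCESS mode.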

    PROCESS Execution Mode

    In PROCESS execution mode, the Python user-defined functions are executed in a separate Python worker process. The Java operator process communicates with the Python worker process through several gRPC services.

    THREAD Execution Mode

    In THREAD execution mode, the Python user-defined functions are executed in the same process as the Java operators. PyFlink uses the third-party library PEMJA to embed Python in the Java application.