User Defined Function RPC

    1. Advantages

      • Cross-language: UDF services can be written in all languages supported by Protobuf.
      • Security: A failure or crash during UDF execution affects only the UDF Service and does not cause the Doris process to crash.
      • Flexibility: Any other Service or library class can be invoked within a UDF Service to meet a wider variety of business requirements.

    This section describes how to develop a Remote RPC Service. A Java sample is provided for reference.

    • function_service.proto
      • PFunctionCallRequest
        • function_name: The function name, corresponding to the symbol specified when the function was created
        • args: The arguments passed to the function
        • context: Query context information
      • PFunctionCallResponse
        • result: The return result
        • status: The return status; 0 indicates normal
      • PCheckFunctionRequest
        • function: Function-related information
      • PCheckFunctionResponse
        • status: The return status; 0 indicates normal

    Use protoc to generate the code. Run protoc -h to see the available parameters.

    The following three methods need to be implemented:

    • fnCall: Contains the computational logic
    • checkFn: Verifies function names, parameters, and return values when a UDF is created
    • handShake: Used to probe the interface
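    The three methods above can be sketched as follows. This is a minimal, self-contained Python model rather than the code protoc generates: the dataclasses are simplified stand-ins for the messages in function_service.proto, and the class name AddIntService, the snake_case method names, the StatusResponse type, and the fields of CheckFunctionRequest are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Simplified stand-ins for the generated protobuf messages (assumption:
# the real messages carry richer, typed structures than these).


@dataclass
class FunctionCallRequest:
    function_name: str            # symbol given when the function was created
    args: List[List[int]]         # one list of values per argument
    context: Dict[str, str] = field(default_factory=dict)


@dataclass
class FunctionCallResponse:
    result: List[int]
    status: int                   # 0 indicates normal


@dataclass
class CheckFunctionRequest:
    function_name: str            # hypothetical fields describing the function
    arg_types: List[str]
    return_type: str


@dataclass
class StatusResponse:
    status: int                   # 0 indicates normal


class AddIntService:
    """Hypothetical service exposing a single UDF symbol, add_int."""

    # symbol -> (argument types, return type)
    SIGNATURES = {"add_int": (["INT", "INT"], "INT")}

    def fn_call(self, request: FunctionCallRequest) -> FunctionCallResponse:
        """fnCall: run the computation for the requested symbol."""
        if request.function_name not in self.SIGNATURES or len(request.args) != 2:
            return FunctionCallResponse(result=[], status=1)
        left, right = request.args
        # Add the two arguments row by row.
        return FunctionCallResponse(
            result=[a + b for a, b in zip(left, right)], status=0
        )

    def check_fn(self, request: CheckFunctionRequest) -> StatusResponse:
        """checkFn: validate name, parameters, and return type at creation."""
        expected = self.SIGNATURES.get(request.function_name)
        actual = (request.arg_types, request.return_type)
        return StatusResponse(status=0 if expected == actual else 1)

    def hand_shake(self) -> StatusResponse:
        """handShake: answer an interface probe."""
        return StatusResponse(status=0)
```

    In a real service these methods would be the handlers of the generated RPC stub; keeping the logic in a plain class like this makes it possible to unit test the UDF without starting a server.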

    UDTFs are not currently supported.

    1. symbol in PROPERTIES represents the name of the method passed by the RPC call. This parameter must be set.
    2. object_file in PROPERTIES represents the RPC service address. Currently, a single address and a cluster address in BRPC-compatible format are supported. For cluster addresses, refer to the cluster connection mode.
    3. type in PROPERTIES indicates the UDF call type, which is Native by default. Set it to Rpc when using an RPC UDF.
    4. name: A function belongs to a DB, and its name has the form dbName.funcName. When dbName is not explicitly specified, the DB of the current session is used.

    Sample:

    CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES (
      "SYMBOL"="add_int",
      "OBJECT_FILE"="127.0.0.1:9090",
      "TYPE"="RPC"
    );

    Users must have the SELECT permission on the corresponding database to use a UDF/UDAF.

    A UDF is used in the same way as an ordinary function. The only difference is that built-in functions have global scope, while a UDF is scoped to its DB. When the session is connected to a database, using the UDF name directly looks up the UDF in the current DB. Otherwise, the user needs to explicitly specify the UDF's database name, such as dbName.funcName.
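    The lookup rule above can be modeled in a few lines. This is only an illustrative sketch of the rule as described here, not Doris code; the function name resolve_udf_name and the database name test_db are hypothetical.

```python
from typing import Optional, Tuple


def resolve_udf_name(name: str, session_db: Optional[str]) -> Tuple[str, str]:
    """Return (database, function) for a UDF reference.

    A qualified name such as dbName.funcName names its database explicitly.
    An unqualified name falls back to the database of the current session.
    """
    if "." in name:
        db_name, func_name = name.split(".", 1)
        return db_name, func_name
    if session_db is None:
        # No current database and no explicit one: the UDF cannot be located.
        raise ValueError(f"no database selected for UDF '{name}'")
    return session_db, name


# Unqualified name: resolved against the session's database.
print(resolve_udf_name("rpc_add", "test_db"))
# Qualified name: the session's database is ignored.
print(resolve_udf_name("test_db.rpc_add", None))
```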

    Example RPC server implementations in C++, Java, and Python are provided in the samples/doris-demo/ directory. See each directory for details on how to use them.