Pulsar SQL configuration and deployment

    You can configure the Presto Pulsar connector in its properties file. The connector configuration and its default values are as follows.

    You can connect Presto to a Pulsar cluster through multiple hosts. To configure multiple hosts for brokers, add multiple URLs to pulsar.web-service-url. To configure multiple hosts for ZooKeeper, add multiple URIs to pulsar.zookeeper-uri. The following is an example.

    pulsar.web-service-url=http://localhost:8080,localhost:8081,localhost:8082
    pulsar.zookeeper-uri=localhost1,localhost2:2181

    If you already have a Presto cluster, you can copy the Presto Pulsar connector plugin to your existing cluster. Download the archived plugin package with the following command.

    $ wget https://archive.apache.org/dist/pulsar/pulsar-2.8.0/apache-pulsar-2.8.0-bin.tar.gz

    Since Pulsar SQL is powered by Trino (formerly Presto SQL), a Pulsar SQL worker is configured for deployment in the same way as a Trino worker.

    You can use the same CLI arguments as the Presto launcher.

    $ ./bin/pulsar sql-worker --help
    Usage: launcher [options] command
    Commands: run, start, stop, restart, kill, status
    Options:
      -h, --help            show this help message and exit
      -v, --verbose         Run verbosely
      --etc-dir=DIR         Defaults to INSTALL_PATH/etc
      --launcher-config=FILE
                            Defaults to INSTALL_PATH/bin/launcher.properties
      --node-config=FILE    Defaults to ETC_DIR/node.properties
      --log-levels-file=FILE
                            Defaults to ETC_DIR/log.properties
      --data-dir=DIR        Defaults to INSTALL_PATH
      --pid-file=FILE       Defaults to DATA_DIR/var/run/launcher.pid
      --launcher-log-file=FILE
                            Defaults to DATA_DIR/var/log/launcher.log (only in
                            daemon mode)
      --server-log-file=FILE
                            Defaults to DATA_DIR/var/log/server.log (only in
                            daemon mode)
      -D NAME=VALUE         Set a Java system property

    The default configuration for the cluster is located in ${project.root}/conf/presto. You can customize your deployment by modifying the default configuration.

    You can set the worker to read from a different configuration directory, or set a different directory for writing data.
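
    As an illustration, the --etc-dir and --data-dir options from the help output above could be combined as in the following sketch. The PULSAR_HOME variable and both /tmp paths are placeholders chosen for this example, not defaults, and the guard only keeps the snippet harmless when it is run outside a Pulsar installation.

```shell
# Hypothetical locations; replace them with directories that suit your deployment.
PULSAR_HOME="${PULSAR_HOME:-.}"
ETC_DIR=/tmp/pulsar-sql/etc      # assumed to already hold a copy of conf/presto
DATA_DIR=/tmp/pulsar-sql/data    # assumed writable directory for worker data

# Only the data directory is created here; the configuration directory
# must already contain the worker's configuration files.
mkdir -p "$DATA_DIR"

if [ -x "$PULSAR_HOME/bin/pulsar" ]; then
  # Run the worker in the foreground with the custom directories.
  "$PULSAR_HOME/bin/pulsar" sql-worker run --etc-dir="$ETC_DIR" --data-dir="$DATA_DIR"
else
  echo "bin/pulsar not found under $PULSAR_HOME; skipping launch" >&2
fi
```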

    You can start the worker as a daemon process.

    $ ./bin/pulsar sql-worker start

    To deploy a Pulsar SQL cluster on three nodes, follow these steps.

    1. Copy the Pulsar binary distribution to the three nodes.

    The first node runs as Presto coordinator. The minimal configuration requirement in the ${project.root}/conf/presto/config.properties file is as follows.

    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=8080
    query.max-memory=50GB
    discovery.uri=<coordinator-url>

    The other two nodes serve as worker nodes. You can use the following configuration for them.

    coordinator=false
    http-server.http.port=8080
    query.max-memory=50GB
    query.max-memory-per-node=1GB
    discovery.uri=<coordinator-url>

    2. Modify the pulsar.web-service-url and pulsar.zookeeper-uri configuration in the ${project.root}/conf/presto/catalog/pulsar.properties file accordingly for the three nodes.

    3. Start the coordinator node.

    4. Start the worker nodes.

    $ ./bin/pulsar sql-worker run

    5. Start the SQL CLI and check the status of your cluster.

    $ ./bin/pulsar sql --server <coordinator-url>

    6. Check the status of your nodes.

    presto> SELECT * FROM system.runtime.nodes;
     node_id |        http_uri         | node_version | coordinator | state
    ---------+-------------------------+--------------+-------------+--------
     1       | http://192.168.2.1:8081 | testversion  | true        | active
     3       | http://192.168.2.2:8081 | testversion  | false       | active
     2       | http://192.168.2.3:8081 | testversion  | false       | active
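
    The coordinator and worker settings listed above can also be written out by a small script, which helps keep the three nodes consistent. The following is a minimal sketch: the OUT_DIR scratch directory and the COORDINATOR_URL value (a hypothetical host on the port set by http-server.http.port) are assumptions for illustration, while the property values are the ones given in this section.

```shell
# Assumptions: OUT_DIR is a scratch directory and COORDINATOR_URL is a
# hypothetical address; neither comes from the Pulsar distribution.
OUT_DIR="${OUT_DIR:-/tmp/pulsar-sql-conf}"
COORDINATOR_URL="${COORDINATOR_URL:-http://coordinator.example.com:8080}"

mkdir -p "$OUT_DIR/coordinator" "$OUT_DIR/worker"

# Coordinator node: also allowed to schedule work on itself.
cat > "$OUT_DIR/coordinator/config.properties" <<EOF
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=50GB
discovery.uri=${COORDINATOR_URL}
EOF

# Worker nodes: the same file is used on both workers.
cat > "$OUT_DIR/worker/config.properties" <<EOF
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=${COORDINATOR_URL}
EOF

# List the generated files.
ls "$OUT_DIR/coordinator" "$OUT_DIR/worker"
```

    Each generated file would then be copied to ${project.root}/conf/presto/config.properties on the matching node.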

    Note
    The broker does not advance the LAC (last add confirmed), so when Pulsar SQL bypasses the broker to query data, it can read entries only up to the LAC that all the bookies have learned. To make the broker write the LAC periodically, set bookkeeperExplicitLacIntervalInMills in broker.conf.
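
    One way to apply the setting from the note is sketched below. The interval of 1000 is a hypothetical value (the parameter name suggests milliseconds), and the snippet edits a scratch file rather than a real broker.conf; point CONF at your own file only after reviewing it. If the key is absent from the file, the sed command changes nothing.

```shell
# Assumptions: CONF is a scratch stand-in for conf/broker.conf, and
# INTERVAL_MS is an illustrative value, not a recommended setting.
CONF="${CONF:-/tmp/broker.conf.example}"
INTERVAL_MS=1000

# Create a one-line stand-in file if none exists, so the sketch is safe to run.
[ -f "$CONF" ] || echo "bookkeeperExplicitLacIntervalInMills=0" > "$CONF"

# Rewrite the existing key in place via a temporary file.
sed "s/^bookkeeperExplicitLacIntervalInMills=.*/bookkeeperExplicitLacIntervalInMills=${INTERVAL_MS}/" "$CONF" > "$CONF.new"
mv "$CONF.new" "$CONF"

# Show the resulting line.
grep '^bookkeeperExplicitLacIntervalInMills=' "$CONF"
```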