load balancing

    Retry and load balance yourself in the application layer code. For example, if a connection is found to be down, it will automatically retry on other connections. Application layer code retry requires the application to configure multiple doris front-end node addresses.

    If you use mysql jdbc connector to connect to Doris, you can use jdbc’s automatic retry mechanism:

    For details, please refer to Mysql official website document

    ProxySQL is a flexible and powerful MySQL proxy layer. It is a MySQL middleware that can be actually used in a production environment. It can realize read-write separation, support Query routing function, support dynamic designation of a certain SQL for cache, support dynamic loading configuration, failure Switching and some SQL filtering functions.

    Doris’s FE process is responsible for receiving user connections and query requests. It itself is horizontally scalable and highly available, but it requires users to set up a proxy on multiple FEs to achieve automatic connection load balancing.

    1. # vim /etc/yum.repos.d/proxysql.repo
    2. [proxysql_repo]
    3. name= ProxySQL YUM repository
    4. baseurl=http://repo.proxysql.com/ProxySQL/proxysql-1.4.x/centos/\$releasever
    5. gpgcheck=1
    6. gpgkey=http://repo.proxysql.com/ProxySQL/repo_pub_key
    7. Perform installation
    8. # yum clean all
    9. # yum makecache
    10. # yum -y install proxysql
    11. View version
    12. # proxysql --version
    13. ProxySQL version 1.4.13-15-g69d4207, codename Truls
    14. Set up auto start
    15. # systemctl enable proxysql
    16. # systemctl start proxysql
    17. # systemctl status proxysql
    18. After startup, it will listen to two ports, the default is 6032 and 6033. Port 6032 is the management port of ProxySQL, and 6033 is the port for ProxySQL to provide external services (that is, the forwarding port connected to the real database of the forwarding backend).
    19. # netstat -tunlp
    20. Active Internet connections (only servers)
    21. Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
    22. tcp 0 0 0.0.0.0:6032 0.0.0.0:* LISTEN 23940/proxysql
    23. tcp 0 0 0.0.0.0:6033 0.0.0.0:* LISTEN

    ProxySQL Config

    ProxySQL has a configuration file /etc/proxysql.cnf and a configuration database file /var/lib/proxysql/proxysql.db. Special attention is needed here: If there is a "proxysql.db" file (under the /var/lib/proxysql directory), the ProxySQL service will only be read when it is started for the first time The proxysql.cnf file and parse it; after startup, the proxysql.cnf file will not be read! If you want the configuration in the proxysql.cnf file to take effect after restarting the proxysql service (that is, you want proxysql to read and parse the proxysql.cnf configuration file when it restarts), you need to delete /var/lib/proxysql/proxysql first. dbdatabase file, and then restart the proxysql service. This is equivalent to initializing the proxysql service, and a pure proxysql.db database file will be produced again (if proxysql related routing rules, etc. are configured before, it will be erased)

    View and modify configuration files

    1. # egrep -v "^#|^$" /etc/proxysql.cnf
    2. datadir="/var/lib/proxysql" #data dir
    3. admin_variables=
    4. {
    5. admin_credentials="admin:admin" #User name and password for connecting to the management terminal
    6. mysql_ifaces="0.0.0.0:6032" #Management port, used to connect to proxysql management database
    7. }
    8. mysql_variables=
    9. {
    10. threads=4 #Specify the number of threads opened for the forwarding port
    11. max_connections=2048
    12. default_query_delay=0
    13. default_query_timeout=36000000
    14. have_compress=true
    15. poll_timeout=2000
    16. interfaces="0.0.0.0:6033" #Specify the forwarding port, used to connect to the back-end mysql database, which is equivalent to acting as a proxy
    17. default_schema="information_schema"
    18. stacksize=1048576
    19. server_version="5.5.30" #Specify the version of the backend mysql
    20. connect_timeout_server=3000
    21. monitor_username="monitor"
    22. monitor_password="monitor"
    23. monitor_history=600000
    24. monitor_connect_interval=60000
    25. monitor_ping_interval=10000
    26. monitor_read_only_interval=1500
    27. monitor_read_only_timeout=500
    28. ping_interval_server_msec=120000
    29. ping_timeout_server=500
    30. commands_stats=true
    31. sessions_sort=true
    32. connect_retries_on_failure=10
    33. }
    34. mysql_servers =
    35. (
    36. )
    37. mysql_users:
    38. (
    39. )
    40. mysql_query_rules:
    41. (
    42. )
    43. scheduler=
    44. (
    45. )
    46. mysql_replication_hostgroups=
    47. (
    48. )

    Connect to the ProxySQL management port test

    1. # mysql -uadmin -padmin -P6032 -hdoris01
    2. View the global_variables table information of the main library (it is in this library after login by default)
    3. MySQL [(none)]> show databases;
    4. +-----+---------------+-------------------------------------+
    5. | seq | name | file |
    6. +-----+---------------+-------------------------------------+
    7. | 0 | main | |
    8. | 2 | disk | /var/lib/proxysql/proxysql.db |
    9. | 3 | stats | |
    10. | 4 | monitor | |
    11. | 5 | stats_history | /var/lib/proxysql/proxysql_stats.db |
    12. +-----+---------------+-------------------------------------+
    13. 5 rows in set (0.000 sec)
    14. MySQL [(none)]> use main;
    15. Reading table information for completion of table and column names
    16. You can turn off this feature to get a quicker startup with -A
    17. MySQL [main]> show tables;
    18. +--------------------------------------------+
    19. | tables |
    20. +--------------------------------------------+
    21. | global_variables |
    22. | mysql_collations |
    23. | mysql_group_replication_hostgroups |
    24. | mysql_query_rules |
    25. | mysql_replication_hostgroups |
    26. | mysql_servers |
    27. | mysql_users |
    28. | proxysql_servers |
    29. | runtime_checksums_values |
    30. | runtime_global_variables |
    31. | runtime_mysql_group_replication_hostgroups |
    32. | runtime_mysql_query_rules |
    33. | runtime_mysql_query_rules_fast_routing |
    34. | runtime_mysql_replication_hostgroups |
    35. | runtime_mysql_servers |
    36. | runtime_mysql_users |
    37. | runtime_proxysql_servers |
    38. | runtime_scheduler |
    39. | scheduler |
    40. +--------------------------------------------+
    41. 20 rows in set (0.000 sec)

    ProxySQL configuration backend Doris FE

    Use the insert statement to add the host to the mysql_servers table, where: hostgroup_id is 10 for the write group, and 20 for the read group. We don’t need to read and write the license here, and it doesn’t matter which one can be set randomly.

    1. [root@mysql-proxy ~]# mysql -uadmin -padmin -P6032 -h127.0.0.1
    2. ............
    3. MySQL [(none)]> insert into mysql_servers(hostgroup_id,hostname,port) values(10,'192.168.9.211',9030);
    4. Query OK, 1 row affected (0.000 sec)
    5. MySQL [(none)]> insert into mysql_servers(hostgroup_id,hostname,port) values(10,'192.168.9.212',9030);
    6. Query OK, 1 row affected (0.000 sec)
    7. MySQL [(none)]> insert into mysql_servers(hostgroup_id,hostname,port) values(10,'192.168.9.213',9030);
    8. Query OK, 1 row affected (0.000 sec)
    9. If an error occurs during the insertion process:
    10. ERROR 1045 (#2800): UNIQUE constraint failed: mysql_servers.hostgroup_id, mysql_servers.hostname, mysql_servers.port
    11. It means that other configurations may have been defined before, you can clear this table or delete the configuration of the corresponding host
    12. MySQL [(none)]> select * from mysql_servers;
    13. MySQL [(none)]> delete from mysql_servers;
    14. Query OK, 6 rows affected (0.000 sec)
    15. Check whether these 3 nodes are inserted successfully and their status.
    16. MySQL [(none)]> select * from mysql_servers\G;
    17. *************************** 1. row ***************************
    18. hostgroup_id: 10
    19. hostname: 192.168.9.211
    20. port: 9030
    21. status: ONLINE
    22. weight: 1
    23. compression: 0
    24. max_connections: 1000
    25. max_replication_lag: 0
    26. use_ssl: 0
    27. max_latency_ms: 0
    28. comment:
    29. *************************** 2. row ***************************
    30. hostgroup_id: 10
    31. hostname: 192.168.9.212
    32. port: 9030
    33. status: ONLINE
    34. weight: 1
    35. compression: 0
    36. max_connections: 1000
    37. max_replication_lag: 0
    38. use_ssl: 0
    39. max_latency_ms: 0
    40. comment:
    41. *************************** 3. row ***************************
    42. hostgroup_id: 10
    43. hostname: 192.168.9.213
    44. port: 9030
    45. status: ONLINE
    46. weight: 1
    47. compression: 0
    48. max_connections: 1000
    49. max_replication_lag: 0
    50. use_ssl: 0
    51. max_latency_ms: 0
    52. comment:
    53. 6 rows in set (0.000 sec)
    54. ERROR: No query specified
    55. After the above modification, load it to RUNTIME and save it to disk. The following two steps are very important, otherwise your configuration information will be gone after you exit and must be saved
    56. MySQL [(none)]> load mysql servers to runtime;
    57. Query OK, 0 rows affected (0.006 sec)
    58. MySQL [(none)]> save mysql servers to disk;
    59. Query OK, 0 rows affected (0.348 sec)

    Monitor Doris FE node configuration

    After adding doris fe nodes, you also need to monitor these back-end nodes. For multiple FE high-availability load balancing environments on the backend, this is necessary because ProxySQL needs to be automatically adjusted by the read_only value of each node

    Whether they belong to the read group or the write group.

    First create a user name for monitoring on the back-end master main data node

    Configure Doris users

    All the above configurations are about the back-end Doris FE node. Now you can configure the SQL statements, including: the user who sends the SQL statement, the routing rules of the SQL statement, the cache of the SQL query, the rewriting of the SQL statement, and so on.

    This section is the user configuration used by the SQL request, such as the root user. This requires that we need to add relevant users to the back-end Doris FE node first. Here are examples of two user names root and doris.

    1. First, execute on the Doris FE master master database node:
    2. # mysql -P9030 -uroot -p
    3. .........
    4. mysql> create user doris@'%' identified by 'P@ssword1!';
    5. Query OK, 0 rows affected, 1 warning (0.04 sec)
    6. mysql> grant ADMIN_PRIV on *.* to doris@'%';
    7. Query OK, 0 rows affected, 1 warning (0.03 sec)
    8. Then go back to the mysql-proxy proxy layer node, configure the mysql_users table, and add the two users just now to the table.
    9. admin> insert into mysql_users(username,password,default_hostgroup) values('root','',10);
    10. Query OK, 1 row affected (0.001 sec)
    11. admin> insert into mysql_users(username,password,default_hostgroup) values('doris','P@ssword1!',10);
    12. admin> load mysql users to runtime;
    13. Query OK, 0 rows affected (0.001 sec)
    14. Query OK, 0 rows affected (0.108 sec)
    15. The mysql_users table has many fields. The three main fields are username, password, and default_hostgroup:
    16. -username: The username used by the front-end to connect to ProxySQL and ProxySQL to route SQL statements to MySQL.
    17. -password: the password corresponding to the user name. It can be a plain text password or a hash password. If you want to use the hash password, you can execute it on a MySQL node first select password(PASSWORD), and then copy the encryption result to this field.
    18. -default_hostgroup: The default routing destination of the username. For example, when the field value of the specified root user is 10, the SQL statement sent by the root user is used by default
    19. In this case, it will be routed to a node in the hostgroup_id=10 group.
    20. admin> select * from mysql_users\G
    21. *************************** 1. row ***************************
    22. username: root
    23. password:
    24. active: 1
    25. use_ssl: 0
    26. default_hostgroup: 10
    27. default_schema: NULL
    28. schema_locked: 0
    29. transaction_persistent: 1
    30. fast_forward: 0
    31. backend: 1
    32. frontend: 1
    33. max_connections: 10000
    34. *************************** 2. row ***************************
    35. username: doris
    36. password: P@ssword1!
    37. active: 1
    38. use_ssl: 0
    39. default_hostgroup: 10
    40. default_schema: NULL
    41. schema_locked: 0
    42. transaction_persistent: 1
    43. fast_forward: 0
    44. backend: 1
    45. frontend: 1
    46. max_connections: 10000
    47. 2 rows in set (0.000 sec)
    48. Although the mysql_users table is not described in detail here, only users with active=1 are valid users, and the default active is 1.
    49. MySQL [(none)]> load mysql users to runtime;
    50. Query OK, 0 rows affected (0.001 sec)
    51. MySQL [(none)]> save mysql users to disk;
    52. Query OK, 0 rows affected (0.123 sec)
    53. In this way, you can use the doris username and password to connect to ProxySQL through the sql client

    Connect to Doris through ProxySQL for testing

    1. #mysql -uroot -p -P6033 -hdoris01 -e "show databases;"
    2. Enter password:
    3. ERROR 9001 (HY000) at line 1: Max connect timeout reached while reaching hostgroup 10 after 10000ms
    4. At this time, an error was found, and it was not forwarded to the real doris fe on the backend.
    5. Through the log, you can see that there is set autocommit=0 to open the transaction
    6. Check the configuration found:
    7. mysql-forward_autocommit=false
    8. mysql-autocommit_false_is_transaction=false
    9. We dont need to read and write separation here, just turn these two parameters into true directly through the following statement.
    10. mysql> UPDATE global_variables SET variable_value='true' WHERE variable_name='mysql-forward_autocommit';
    11. Query OK, 1 row affected (0.00 sec)
    12. mysql> UPDATE global_variables SET variable_value='true' WHERE variable_name='mysql-autocommit_false_is_transaction';
    13. Query OK, 1 row affected (0.01 sec)
    14. mysql> LOAD MYSQL VARIABLES TO RUNTIME;
    15. Query OK, 0 rows affected (0.00 sec)
    16. mysql> SAVE MYSQL VARIABLES TO DISK;
    17. Query OK, 98 rows affected (0.12 sec)
    18. Then we try again and it shows success
    19. [root@doris01 ~]# mysql -udoris -pP@ssword1! -P6033 -h192.168.9.211 -e "show databases;"
    20. Warning: Using a password on the command line interface can be insecure.
    21. +--------------------+
    22. | Database |
    23. +--------------------+
    24. | doris_audit_db |
    25. | information_schema |
    26. | retail |
    27. +--------------------+

    OK, that’s the end, you can use Mysql client, JDBC, etc. to connect to ProxySQL to operate your doris.

    Overview

    Nginx can implement load balancing of HTTP and HTTPS protocols, as well as load balancing of TCP protocol. So, the question is, can the load balancing of the Apache Doris database be achieved through Nginx? The answer is: yes. Next, let’s discuss how to use Nginx to achieve load balancing of Apache Doris.

    Note: Using Nginx to achieve load balancing of Apache Doris database, the premise is to build an Apache Doris environment. The IP and port of Apache Doris FE are as follows. Here I use one FE to demonstrate, multiple FEs only You need to add multiple FE IP addresses and ports in the configuration

    The Apache Doris and port to access MySQL through Nginx are shown below.

    1. IP: 172.31.7.119
    2. 端口: 9030

    Install dependencies

    1. sudo apt-get install build-essential
    2. sudo apt-get install libpcre3 libpcre3-dev
    3. sudo apt-get install zlib1g-dev
    4. sudo apt-get install openssl libssl-dev

    Install Nginx

    Here is a new configuration file

    1. vim /usr/local/nginx/conf/default.conf

    Then add the following in it

    1. events {
    2. worker_connections 1024;
    3. }
    4. stream {
    5. upstream mysqld {
    6. hash $remote_addr consistent;
    7. server 172.31.7.119:9030 weight=1 max_fails=2 fail_timeout=60s;
    8. ##注意这里如果是多个FE,加载这里就行了
    9. }
    10. ###这里是配置代理的端口,超时时间等
    11. server {
    12. listen 6030;
    13. proxy_connect_timeout 300s;
    14. proxy_timeout 300s;
    15. proxy_pass mysqld;
    16. }
    17. }

    Start Nginx

    1. cd /usr/local/nginx

    verify

    Parameter explanation: -u specifies the Doris username -p specifies the Doris password, my password here is empty, so there is no -h specifies the Nginx proxy server IP-P specifies the port