MULTI LOAD
‘MULTI LOAD’ builds on ‘MINI LOAD’ and lets users import into multiple tables at the same time. The endpoints involved are:
‘/api/{db}/_multi_start’: starts a multi-table import task.
‘/api/{db}/{table}/_load’: adds a table to be imported to an import task. The main difference from ‘MINI LOAD’ is that the ‘sub_label’ parameter must be passed in.
‘/api/{db}/_multi_commit’: submits the entire multi-table import task, after which background processing begins.
‘/api/{db}/_multi_abort’: abandons a multi-table import task.
‘/api/{db}/_multi_desc’: shows the number of jobs submitted by a multi-table import task.
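As a minimal sketch of how these endpoints are addressed, the shell helpers below only assemble the URLs and do not contact Doris. The function names are hypothetical, and passing ‘label’ together with ‘sub_label’ on ‘_load’ is an assumption carried over from how ‘label’ is used with ‘_multi_start’ in the example section.

```shell
#!/bin/sh
# Hypothetical helper: build the URL of a task-level MULTI LOAD endpoint.
# Usage: multi_load_url HOST_PORT DB ACTION LABEL
# ACTION is one of: start, commit, abort, desc
multi_load_url() {
    printf 'http://%s/api/%s/_multi_%s?label=%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: build the URL that adds one table to the task.
# Usage: sub_load_url HOST_PORT DB TABLE LABEL SUB_LABEL
# Assumption: label and sub_label are both passed as query parameters.
sub_load_url() {
    printf 'http://%s/api/%s/%s/_load?label=%s&sub_label=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}
```

For instance, ‘multi_load_url host:port testDb start 123’ prints ‘http://host:port/api/testDb/_multi_start?label=123’. When such a URL is typed directly on a command line, the ‘&’ in the ‘_load’ URL has to be quoted or escaped so the shell does not treat it as a background operator.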
HTTP Protocol Specification
Privilege Authentication: Currently Doris uses HTTP Basic authentication for privilege authentication, so a username and password must be specified when importing. This method passes the password in plaintext, since we are all in the Intranet environment at present…
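HTTP Basic authentication sends the base64 encoding of ‘user:password’ in an ‘Authorization’ header, which is why the password is effectively plaintext on the wire. The sketch below reproduces the header that a client such as curl derives from its ‘-u’ option. The function name is hypothetical, and the ‘base64’ utility is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: print the Authorization header for HTTP Basic auth.
# Usage: basic_auth_header USER PASSWORD
basic_auth_header() {
    # printf is used instead of echo so no trailing newline gets encoded
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}
```

Anyone who can observe the request can reverse the base64 value, so this scheme is only reasonable on a trusted network or over an encrypted transport.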
Expect: An HTTP request sent to Doris must carry the ‘Expect’ header with the value ‘100-continue’. Why? Because Doris redirects the request, and the redirect has to happen before the data content is transferred. This avoids transmitting the data multiple times and thereby improves efficiency.
Content-Length: A request sent to Doris must carry the ‘Content-Length’ header. If less content is sent than ‘Content-Length’ states, Doris assumes a transmission problem occurred and the task submission fails. NOTE: If more data is sent than ‘Content-Length’ states, Doris reads only ‘Content-Length’ bytes of content and imports them.
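To make the two header requirements concrete, the sketch below prints the ‘Expect’ and ‘Content-Length’ headers that would accompany a given data file. The function name is hypothetical. curl normally fills in ‘Content-Length’ itself when uploading a regular file with ‘-T’, so this is only an illustration of what goes on the wire, not something that has to be done by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the request headers discussed above for a file.
# Usage: upload_headers FILE
upload_headers() {
    # wc -c counts bytes; tr removes the padding some wc versions emit
    size=$(wc -c < "$1" | tr -d ' ')
    printf 'Expect: 100-continue\n'
    printf 'Content-Length: %s\n' "$size"
}
```

Comparing the printed size with the file on disk is a quick way to spot a truncated file before submitting it.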
Description of parameters:
user: user_name if the user is in default_cluster; otherwise, user_name@cluster_name.
sub_label: Specifies a sub-version number within a multi-table import task. For loads that belong to a multi-table import, this parameter must be passed in.
columns: Describes the corresponding column names in the import file. If it is not passed in, the column order in the file is assumed to match the column order in the table definition. Columns are given as a comma-separated list, such as columns=k1,k2,k3,k4
column_separator: Specifies the separator between columns; the default is ‘\t’. NOTE: URL encoding is required. For example, to specify ‘\t’ as the delimiter, pass in ‘column_separator=%09’
max_filter_ratio: Specifies the maximum proportion of irregular data that may be filtered out. The default is 0, meaning no filtering is allowed. A custom value is given as ‘max_filter_ratio=0.2’, meaning that a 20% error rate is allowed. It takes effect when passed in at ‘_multi_start’
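The URL-encoding rule for ‘column_separator’ can be sketched with a small shell helper that percent-encodes a single-byte separator. The function name is hypothetical, and the helper assumes the separator is one ASCII character; multi-byte separators are not handled.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode a single ASCII separator character.
# Usage: encode_separator CHAR
encode_separator() {
    # An argument of the form "'X" makes printf use the numeric code of X
    printf '%%%02X\n' "'$1"
}
```

The result can be placed straight into the query string, for example ‘column_separator=$(encode_separator ",")’ when the file is comma-separated.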
NOTE:
Currently, it is not possible to submit multiple files in the form ‘{file1, file2}’, because curl splits them into multiple requests, and multiple requests cannot share a label number.
curl can be used to import data into Doris in a streaming-like way, but the real import only takes place after the streaming is over, so the amount of data imported this way cannot be too large.
example
Multi-table import abandoned midway (user in default_cluster)
    curl --location-trusted -u root -XPOST http://host:port/api/testDb/_multi_start?label=123
    curl --location-trusted -u root -T testData1
    curl --location-trusted -u root -XPOST http://host:port/api/testDb/_multi_abort?label=123
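For comparison, the whole sequence of a multi-table import can be sketched as a dry run that prints the curl commands instead of executing them. The table names ‘testTbl1’ and ‘testTbl2’, the second file ‘testData2’, the sub_label values, and the function name are all illustrative assumptions; the upload URLs follow the ‘/api/{db}/{table}/_load’ pattern described earlier.

```shell
#!/bin/sh
# Hypothetical dry run: print (not execute) the commands of one import task.
# Usage: multi_load_plan FINAL_ACTION   (commit or abort)
# Host, database, label, tables and files are illustrative only.
multi_load_plan() {
    api='http://host:port/api/testDb'
    curl_cmd='curl --location-trusted -u root'
    # 1. Open the task under one label
    printf '%s\n' "$curl_cmd -XPOST '$api/_multi_start?label=123'"
    # 2. Add one sub-load per table, each with its own sub_label (assumed values)
    printf '%s\n' "$curl_cmd -T testData1 '$api/testTbl1/_load?label=123&sub_label=1'"
    printf '%s\n' "$curl_cmd -T testData2 '$api/testTbl2/_load?label=123&sub_label=2'"
    # 3. Finish the task: commit starts processing, abort abandons it
    printf '%s\n' "$curl_cmd -XPOST '$api/_multi_$1?label=123'"
}
```

Running ‘multi_load_plan commit’ lists the four commands in order; reviewing them before pasting into a terminal makes it easy to check that every request uses the same label.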
keyword
MULTI, MINI, LOAD