Tablet Restore Tool
- : a data root directory corresponding to the BE node.
- trash: The directory of the recycle bin.
- time_label: A time label. It keeps each data directory in the recycle bin unique and records the time of the data, and it is used as a subdirectory.
When a user finds that online data has been deleted by mistake, the deleted tablet must be recovered from the recycle bin. This is what the tablet data recovery function is for.
BE provides an HTTP interface and the restore_tablet_tool.sh script to implement this function, and supports both single tablet operation (single mode) and batch operation (batch mode).
- In single mode, the data of a single tablet is recovered.
- In batch mode, the data of multiple tablets is recovered in one run.
single mode
http request method
BE provides an HTTP interface for single tablet data recovery. The interface is as follows:
curl -X POST "http://be_host:be_webserver_port/api/restore_tablet?tablet_id=11111&schema_hash=12345"
A successful result looks as follows:
{"status": "Success", "msg": "OK"}
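The request and the success check above can be wrapped in two small shell helpers. This is only a sketch: the function names build_restore_url and is_restore_success are hypothetical, and the check assumes the response body has the same JSON shape as the success example shown above.

```shell
#!/bin/sh
# build_restore_url (hypothetical helper): compose the restore_tablet URL.
# $1 = BE host, $2 = BE webserver port, $3 = tablet id, $4 = schema hash
build_restore_url() {
    printf 'http://%s:%s/api/restore_tablet?tablet_id=%s&schema_hash=%s\n' \
        "$1" "$2" "$3" "$4"
}

# is_restore_success (hypothetical helper): return 0 if the response body
# contains "status": "Success", and non-zero otherwise.
# $1 = response body returned by the BE
is_restore_success() {
    printf '%s\n' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"Success"'
}

# Example usage. It needs a reachable BE, so it is left commented out:
# url=$(build_restore_url 127.0.0.1 8040 12345 11111)
# resp=$(curl -s -X POST "$url")
# if is_restore_success "$resp"; then
#     echo "tablet restored"
# else
#     echo "restore failed: $resp"
# fi
```

Keeping the URL inside double quotes when it is passed to curl stops the shell from treating the & as a background operator, so no backslash is needed in front of it.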
If it fails, the corresponding failure reason will be returned. One possible result is as follows:
The restore_tablet_tool.sh script can be used to recover the data of a single tablet:
sh tools/restore_tablet_tool.sh -b "http://127.0.0.1:8040" -t 12345 -s 11111
sh tools/restore_tablet_tool.sh --backend "http://127.0.0.1:8040" --tablet_id 12345 --schema_hash 11111
batch mode
Batch recovery mode recovers the data of multiple tablets in one run.
Before using it, put the tablet id and schema hash of each tablet to be restored in a file, comma-separated, one tablet per line.
The format is as follows:
12345,11111
12347,11111
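Before starting a batch, it can help to check that the file really follows this format. The sketch below is not part of the tool. The helper names check_tablet_file and print_restore_commands are hypothetical, and the sketch assumes that tablet ids and schema hashes are plain decimal numbers with no surrounding whitespace. The second helper only prints the equivalent single mode commands (using the -b, -t and -s options shown earlier) as a dry run; it does not execute anything.

```shell
#!/bin/sh
# check_tablet_file (hypothetical helper): print every malformed line with
# its line number. Returns 0 if all lines look like "digits,digits",
# 1 if at least one line is malformed, 2 if the file cannot be read.
# $1 = path to the tablet list file
check_tablet_file() {
    [ -r "$1" ] || return 2
    # grep -v selects the lines that do NOT match; -n prefixes line numbers.
    if grep -Evn '^[0-9]+,[0-9]+$' "$1"; then
        return 1
    fi
    return 0
}

# print_restore_commands (hypothetical helper): dry run that prints one
# single mode command per line of the file.
# $1 = backend URL, $2 = path to the tablet list file
print_restore_commands() {
    while IFS=, read -r tablet_id schema_hash || [ -n "$tablet_id" ]; do
        [ -n "$tablet_id" ] || continue   # skip empty lines
        printf 'sh tools/restore_tablet_tool.sh -b "%s" -t %s -s %s\n' \
            "$1" "$tablet_id" "$schema_hash"
    done < "$2"
}
```

For example, `check_tablet_file tablets_to_restore.txt && print_restore_commands "http://127.0.0.1:8040" tablets_to_restore.txt` prints the commands only if the file is well formed; the file name here is only a placeholder.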
Then perform the recovery with the following command (assuming the file name is: ):
Repair missing or damaged Tablet
In rare cases, such as code bugs or human error, all replicas of some tablets may be lost. In this case, the data is effectively gone. However, in some scenarios the business still wants queries to run without errors despite the data loss, so that users notice the loss as little as possible. For this purpose, blank tablets can be used to fill in the missing replicas so that queries can be executed normally.
View Master FE log fe.log
If there is data loss, the log will contain an entry similar to the following:
backend [10001] invalid situation. tablet[20000] has few replica[1], replica num setting is [3]
This log indicates that all replicas of tablet 20000 have been damaged or lost.
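To collect the affected tablet ids from a large log, the entries can be filtered with standard tools. The sketch below assumes that the log lines have exactly the wording of the example above. The helper name extract_lost_tablets is hypothetical, and the location of fe.log is passed in as an argument because it depends on the deployment.

```shell
#!/bin/sh
# extract_lost_tablets (hypothetical helper): print the id of every tablet
# reported in an "invalid situation ... has few replica" line, one id per
# line, with duplicates removed.
# $1 = path to the Master FE log file
extract_lost_tablets() {
    sed -n 's/.*invalid situation\. tablet\[\([0-9][0-9]*\)\] has few replica.*/\1/p' "$1" \
        | sort -u
}

# Example usage (the path is a placeholder):
# extract_lost_tablets /path/to/fe.log
```

Lines that do not match the pattern are ignored, so the helper can be run on the whole log file.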
Use blank replicas to fill in missing copies
After confirming that the data cannot be recovered, you can execute the following command to generate blank replicas.
ADMIN SET FRONTEND CONFIG ("recover_with_empty_tablet" = "true");
- Note: You can first check whether the current version supports this parameter with the ADMIN SHOW FRONTEND CONFIG; command.
A few minutes after the setup is complete, you should see the following log in the Master FE log fe.log:
Then run a query to check whether the repair succeeded.