Tablet Local Debug

    When a problem occurs on an online tablet, it is often necessary to copy the tablet's replica data to a local environment, reproduce the problem there, and then locate its cause.

    1. Get information about the tablet

    The tablet id can be confirmed from the BE log. The remaining information can then be obtained with the following commands (assuming the tablet id is 10020).

    First, run SHOW TABLET 10020 to get information such as the DbId/TableId/PartitionId where the tablet is located, together with a DetailCmd.

    Then execute the DetailCmd from the previous step to obtain information such as BackendId/SchemaHash.

    mysql> SHOW PROC '/dbs/10004/10016/partitions/10015/10017/10020'\G
    *************************** 1. row ***************************
    ReplicaId: 10021
    BackendId: 10003
    Version: 3
    LstSuccessVersion: 3
    LstFailedVersion: -1
    LstFailedTime: NULL
    SchemaHash: 785778507
    LocalDataSize: 780
    RemoteDataSize: 0
    RowCount: 2
    State: NORMAL
    IsBad: false
    VersionCount: 3
    PathHash: 7390150550643804973
    MetaUrl: http://192.168.10.1:8040/api/meta/header/10020
    CompactionStatus: http://192.168.10.1:8040/api/compaction/show?tablet_id=10020
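
If the output above is captured as text, the BackendId and SchemaHash can be pulled out programmatically. The following is a minimal sketch, not part of Doris; the helper name parse_vertical_row is hypothetical, and it assumes the vertical (\G) layout of one "Column: value" pair per line shown above.

```python
def parse_vertical_row(text):
    """Parse one row of vertical (\\G) mysql output into a {column: value} dict."""
    row = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the prompt line, and the "*** 1. row ***" separator.
        if not line or line.startswith("*") or line.startswith("mysql>"):
            continue
        # Split on the first colon only, so URL values keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            row[key.strip()] = value.strip()
    return row


sample = """
*************************** 1. row ***************************
ReplicaId: 10021
BackendId: 10003
SchemaHash: 785778507
MetaUrl: http://192.168.10.1:8040/api/meta/header/10020
"""
row = parse_vertical_row(sample)
print(row["BackendId"], row["SchemaHash"])
```

Splitting on the first colon only is what keeps the MetaUrl and CompactionStatus values intact.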

    Create a tablet snapshot and get the table creation statement:

    mysql> admin copy tablet 10020 properties("backend_id" = "10003", "version" = "2")\G
    *************************** 1. row ***************************
    TabletId: 10020
    BackendId: 10003
    Ip: 192.168.10.1
    Path: /path/to/be/storage/snapshot/20220830101353.2.3600
    ExpirationMinutes: 60
    CreateTableStmt: CREATE TABLE `tbl1` (
    `k1` int(11) NULL,
    DUPLICATE KEY(`k1`, `k2`)
    DISTRIBUTED BY HASH(k1) BUCKETS 1
    PROPERTIES (
    "replication_num" = "1",
    "version_info" = "2"
    );

    The admin copy tablet command can generate a snapshot file of the corresponding replica and version for the specified tablet. Snapshot files are stored in the Path directory of the BE node indicated by the Ip field.

    Under this directory there is a subdirectory named after the tablet id. Package this subdirectory as a whole for later use. (Note that the directory is kept for at most 60 minutes, after which it is automatically deleted.)
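
The packaging step can be done with any archive tool. As one possible sketch in Python, the function below packs the tablet-id subdirectory of the snapshot Path into a .tar.gz file. The function name package_tablet_dir and its out_dir parameter are hypothetical; it must be run on the BE node named in the Ip field, before the snapshot expires.

```python
import tarfile
from pathlib import Path


def package_tablet_dir(snapshot_path, tablet_id, out_dir="."):
    """Pack <snapshot_path>/<tablet_id> into <out_dir>/<tablet_id>.tar.gz."""
    tablet_dir = Path(snapshot_path) / str(tablet_id)
    if not tablet_dir.is_dir():
        raise FileNotFoundError(f"no tablet directory at {tablet_dir}")
    archive = Path(out_dir) / f"{tablet_id}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname stores paths relative to the tablet-id directory name,
        # so unpacking recreates "<tablet_id>/..." wherever it is extracted.
        tar.add(str(tablet_dir), arcname=str(tablet_id))
    return archive


# Example (paths taken from the output above):
# package_tablet_dir("/path/to/be/storage/snapshot/20220830101353.2.3600", 10020)
```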

    The command also generates the table creation statement corresponding to the tablet. Note that this is not the original table creation statement: its bucket number and replica number are both 1, and the version_info property is specified. This statement is used later when loading the tablet locally.

    At this point we have obtained all the necessary information:

    1. The packaged tablet data, such as 10020.tar.gz.
    2. The table creation statement.

    2. Load the tablet locally

    1. Deploy a single-node Doris cluster (1 FE, 1 BE) locally, using the same version as the online cluster. For example, if the online cluster runs DORIS-1.1.1, the local environment must also run DORIS-1.1.1.

    2. Create a table

      Create a table in the local environment using the create table statement from the previous step.

    3. Get the tablet information of the newly created table

      Because the newly created table has one bucket and one replica, there will be only one tablet with one replica:

      mysql> show tablets from tbl1\G
      *************************** 1. row ***************************
      TabletId: 10017
      ReplicaId: 10018
      BackendId: 10003
      SchemaHash: 44622287
      Version: 1
      LstSuccessVersion: 1
      LstFailedVersion: -1
      LstFailedTime: NULL
      LocalDataSize: 0
      RemoteDataSize: 0
      RowCount: 0
      State: NORMAL
      LstConsistencyCheckTime: NULL
      CheckVersion: -1
      VersionCount: -1
      CompactionStatus: http://192.168.10.1:8040/api/compaction/show?tablet_id=10017

      mysql> show tablet 10017\G
      *************************** 1. row ***************************
      DbName: default_cluster:db1
      TableName: tbl1
      PartitionName: tbl1
      IndexName: tbl1
      DbId: 10004
      TableId: 10015
      PartitionId: 10014
      IndexId: 10016
      IsSync: true
      Order: 0
      DetailCmd: SHOW PROC '/dbs/10004/10015/partitions/10014/10016/10017';

      Record the following information from the output above:

      • TableId
      • PartitionId
      • TabletId
      • SchemaHash

      We also need to go to the data directory of the BE node in the debugging environment to confirm the shard id where the new tablet is located:

      cd /path/to/storage/data/*/10017 && pwd

      This command enters the directory where tablet 10017 is located and prints its path. The path will look similar to the following:

      /path/to/storage/data/0/10017

      where 0 is the shard id.
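
The same lookup can be scripted. The sketch below assumes the layout shown above, <storage root>/data/<shard id>/<tablet id>, and returns the shard id as an integer. The function name find_shard_id is hypothetical.

```python
from pathlib import Path


def find_shard_id(storage_root, tablet_id):
    """Return the shard id of the directory <storage_root>/data/<shard>/<tablet_id>."""
    data_dir = Path(storage_root) / "data"
    matches = [p for p in data_dir.glob(f"*/{tablet_id}") if p.is_dir()]
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one directory for tablet {tablet_id}, "
            f"found {len(matches)}"
        )
    # The parent directory name is the shard id, e.g. ".../data/0/10017" -> 0.
    return int(matches[0].parent.name)


# Example: find_shard_id("/path/to/storage", 10017)
```

Failing loudly when zero or several directories match avoids patching the tablet meta with a wrong shard id in the next step.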

    4. Unzip the tablet data package obtained in the first step. Open the 10017.hdr.json file in an editor and change the following fields to the values obtained in the previous step:

      "table_id":10015
      "partition_id":10014
      "tablet_id":10017
      "schema_hash":44622287
      "shard_id":0
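
Instead of editing the file by hand, the five fields can be rewritten with a short script. This is a sketch under one assumption that should be checked against the actual file: that the five fields are top-level keys of the JSON document. The function name patch_tablet_header is hypothetical, and the script refuses to add a key that is not already present.

```python
import json

REQUIRED_KEYS = ("table_id", "partition_id", "tablet_id", "schema_hash", "shard_id")


def patch_tablet_header(path, new_values):
    """Overwrite the five id fields in a tablet header JSON file, in place."""
    with open(path, encoding="utf-8") as f:
        header = json.load(f)
    for key in REQUIRED_KEYS:
        # Assumption: each field is a top-level key; fail instead of adding it.
        if key not in header:
            raise KeyError(f"{key} not found at the top level of {path}")
        header[key] = new_values[key]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(header, f, indent=4)
    return header


# Example, using the values recorded in step 3:
# patch_tablet_header("/path/to/10017.hdr.json", {
#     "table_id": 10015, "partition_id": 10014, "tablet_id": 10017,
#     "schema_hash": 44622287, "shard_id": 0,
# })
```
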
    5. Load the tablet

      First, stop the BE process of the debugging environment (./bin/stop_be.sh). Then copy all the .dat files located in the same directory as the 10017.hdr.json file to the /path/to/storage/data/0/10017/44622287 directory. This is the directory of the debugging-environment tablet found in step 3; 10017 and 44622287 are the tablet id and schema hash, respectively.
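
The copy can be done with cp or, as sketched below, with a few lines of Python. The function name copy_dat_files is hypothetical. It expects the destination directory to exist already (it should, because the table was created in step 2) and copies only the .dat files directly inside the source directory.

```python
import shutil
from pathlib import Path


def copy_dat_files(src_dir, dest_dir):
    """Copy every *.dat file directly inside src_dir into the existing dest_dir."""
    dest = Path(dest_dir)
    if not dest.is_dir():
        raise FileNotFoundError(f"destination directory does not exist: {dest}")
    copied = []
    for dat in sorted(Path(src_dir).glob("*.dat")):
        shutil.copy2(str(dat), str(dest / dat.name))
        copied.append(dat.name)
    return copied


# Example: src_dir is the directory that contains 10017.hdr.json.
# copy_dat_files("/path/to/unpacked/dir", "/path/to/storage/data/0/10017/44622287")
```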

      Delete the original tablet meta with meta_tool. The tool is located in the be/lib directory.

      ./lib/meta_tool --operation=delete_meta --root_path=/path/to/storage --tablet_id=10017 --schema_hash=44622287

      Here /path/to/storage is the data root directory of BE. If the deletion succeeds, the delete successfully log will appear.

      Load the new tablet meta via the meta_tool tool.

      ./lib/meta_tool --root_path=/path/to/storage --operation=load_meta --json_meta_path=/path/to/10017.hdr.json

      If the load is successful, the load successfully log will appear.
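
When this procedure is repeated for several tablets, the load command can be wrapped in a small script. The sketch below only assembles and runs the same command line shown above. The function names and the default meta_tool path are hypothetical, and the script prints the tool's output for manual inspection rather than assuming an exact success message or exit code.

```python
import subprocess


def build_load_meta_command(root_path, json_meta_path, meta_tool="./lib/meta_tool"):
    """Return the argv list for the load_meta invocation shown above."""
    return [
        meta_tool,
        f"--root_path={root_path}",
        "--operation=load_meta",
        f"--json_meta_path={json_meta_path}",
    ]


def load_tablet_meta(root_path, json_meta_path, meta_tool="./lib/meta_tool"):
    """Run meta_tool and print its combined output so the log can be checked."""
    cmd = build_load_meta_command(root_path, json_meta_path, meta_tool)
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout + result.stderr)
    return result.returncode


# Example (run from the BE home directory, with the BE process stopped):
# load_tablet_meta("/path/to/storage", "/path/to/10017.hdr.json")
```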

    6. Verification

      Restart the BE process of the debugging environment (./bin/start_be.sh), then query the table. If everything is correct, you can read the data of the loaded tablet or reproduce the online problem.