CS Cleanup Interface Features

    1.2 Goals

    • Modify ContextService so that the time fields related to ContextID and ContextMap are persisted to the database
    • Add RESTful interfaces for searching and cleaning, with support for batch and individual cleanup by time range and by ID list
    • Add the corresponding java sdk interface of cs-client

    2. Overall Design

    This requirement involves the cs-client, cs-persistence and cs-server modules under ContextService: add 3 fields to the existing tables in the cs-persistence module, add 3 RESTful interfaces in the cs-server module, and add 3 SDK APIs in the cs-client module.

    2.1 Technical Architecture

    For the overall architecture of ContextService, please refer to the existing document:

    The access relationship between the modules of ContextService is shown in the following figure

    Table changes are made in the cs-persistence module. The change involves 5 tables: context_id, context_map, context_id_listener, context_key_listener and context_history. Each of them gets 3 new fields: create_time, update_time and access_time. The fields are enabled for the context_id and context_map tables and not enabled for the other three. create_time is set by the persistence module just before it performs the insert. update_time and access_time are set explicitly by the upstream interface that calls the persistence module. In the update interface the two are mutually exclusive: when access_time is present (not null), update_time is not updated; otherwise update_time is updated.
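
    The update rule above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class ContextIdTimesSketch and its methods are hypothetical and not part of cs-persistence, and "access_time is present" is read here as "the update request carries a non-null access time".

```java
import java.time.LocalDateTime;

// Hypothetical sketch, not Linkis code: the three time fields of one context_id row.
class ContextIdTimesSketch {
    LocalDateTime createTime;
    LocalDateTime updateTime;
    LocalDateTime accessTime;

    // create_time is stamped once, just before the insert, and never changed again.
    static ContextIdTimesSketch insert(LocalDateTime now) {
        ContextIdTimesSketch row = new ContextIdTimesSketch();
        row.createTime = now;
        return row;
    }

    // Mutually exclusive update: a non-null incoming access time updates only access_time;
    // otherwise only update_time is set.
    void update(LocalDateTime incomingAccessTime, LocalDateTime now) {
        if (incomingAccessTime != null) {
            accessTime = incomingAccessTime;
        } else {
            updateTime = now;
        }
    }

    public static void main(String[] args) {
        ContextIdTimesSketch row = insert(LocalDateTime.of(2022, 6, 1, 0, 0));
        // Plain update: only update_time changes.
        row.update(null, LocalDateTime.of(2022, 6, 2, 0, 0));
        // Access-driven update: only access_time changes.
        row.update(LocalDateTime.of(2022, 6, 3, 0, 0), LocalDateTime.of(2022, 6, 3, 0, 0));
        System.out.println(row.createTime + " " + row.updateTime + " " + row.accessTime);
    }
}
```

    Keeping the two branches exclusive means that a read-triggered write never disturbs update_time, so the two timestamps can be used independently as cleanup criteria.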

    The update_time field is updated in the cs-cache module: when a new context_id is loaded from the db, the ADD message is detected and access_time is synchronized to the db at that point. Only the create_time, update_time and access_time of the context_id table are actually recorded. Subsequent search and cleanup also work from the context_id table.

    • searchContextIDByTime searches by 3 time ranges (create, update and access time) and returns a list of contextIDs
    • clearAllContextByID clears the context_map and context_id table contents for every ID in the input contextID list
    • clearAllContextByTime searches by the same 3 time ranges and clears all context_map and context_id table contents for the contextIDs it finds
    • Create ContextID. When the user creates a ContextID, create_time is recorded; this field is never updated afterwards.
    • Update ContextID. When the user updates a ContextID, update_time is updated. If the update is served from the cache, access_time is not updated; if the ContextID is first loaded from the db into the cache and then updated, access_time is updated first and the new update_time is written separately afterwards.
    • Query ContextID by time. When the user queries the ContextIDs in a time range, only a list of HAID strings is returned. This interface is paginated and by default is limited to 5000 records.
    • Batch cleanup of ContextIDs. All contextMap data and contextID data corresponding to the incoming idList are cleaned up in one batch. The incoming list may contain at most 5000 entries.
    • Query and clear ContextID. Query first, then batch clear.
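
    The query, batch cleanup and query-then-clear flows can be illustrated with a small in-memory model. This is a sketch under assumptions, not the cs-server implementation: the class ContextCleanupSketch and its methods are hypothetical, two maps stand in for the context_id and context_map tables, only create_time is modelled, time bounds are treated as inclusive, and pageNow is assumed to be 1-based.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the context_id / context_map tables; not Linkis code.
class ContextCleanupSketch {
    static final int MAX_BATCH = 5000; // the limit stated in the text

    // contextID -> create_time (update_time and access_time are omitted for brevity)
    final Map<String, LocalDateTime> contextIdTable = new LinkedHashMap<>();
    // contextID -> its context_map entries
    final Map<String, List<String>> contextMapTable = new LinkedHashMap<>();

    void create(String id, LocalDateTime createTime) {
        contextIdTable.put(id, createTime);
        contextMapTable.put(id, new ArrayList<>());
    }

    // Returns one page of IDs whose create_time lies in [start, end]; a null bound is open.
    List<String> searchByCreateTime(LocalDateTime start, LocalDateTime end, int pageNow, int pageSize) {
        int size = (pageSize <= 0 || pageSize > MAX_BATCH) ? MAX_BATCH : pageSize;
        int page = Math.max(pageNow, 1);
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, LocalDateTime> e : contextIdTable.entrySet()) {
            LocalDateTime t = e.getValue();
            boolean afterStart = start == null || !t.isBefore(start);
            boolean beforeEnd = end == null || !t.isAfter(end);
            if (afterStart && beforeEnd) {
                matches.add(e.getKey());
            }
        }
        int from = Math.min((page - 1) * size, matches.size());
        int to = Math.min(from + size, matches.size());
        return new ArrayList<>(matches.subList(from, to));
    }

    // Removes the context_map data and the context_id row for each ID; returns the rows removed.
    int clearByIds(List<String> idList) {
        if (idList.size() > MAX_BATCH) {
            throw new IllegalArgumentException("idList exceeds " + MAX_BATCH + " entries");
        }
        int cleared = 0;
        for (String id : idList) {
            contextMapTable.remove(id);
            if (contextIdTable.remove(id) != null) {
                cleared++;
            }
        }
        return cleared;
    }

    // Query and clear: search first, then batch clear the IDs that were found.
    int clearByCreateTime(LocalDateTime start, LocalDateTime end) {
        return clearByIds(searchByCreateTime(start, end, 1, MAX_BATCH));
    }

    public static void main(String[] args) {
        ContextCleanupSketch store = new ContextCleanupSketch();
        store.create("id-1", LocalDateTime.of(2022, 5, 20, 0, 0));
        store.create("id-2", LocalDateTime.of(2022, 6, 10, 0, 0));
        int cleared = store.clearByCreateTime(LocalDateTime.of(2022, 6, 1, 0, 0), null);
        System.out.println("cleared " + cleared + ", remaining " + store.contextIdTable.keySet());
    }
}
```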

    The sequence diagram for the flows above is: linkis-contextservice-clean-02.png

    Two points require additional attention: ① The RESTful API in the cs-server service encapsulates each request as a Job, submits it to the queue and blocks while waiting for the result. A new operation type, CLEAR, is defined to make it easy to match the cleanup-related interfaces. ② The Service that processes the Job in ① must be given a name that does not contain ContextID, to avoid the dynamic proxy conversion of the HighAvailable module. That conversion is only meant for interfaces whose request carries a single ContextID; for the batch cleanup and batch query interfaces it is meaningless and hurts performance.
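
    Point ① follows a common submit-and-wait pattern, which can be sketched with the standard java.util.concurrent classes. Everything named here is hypothetical: BlockingJobSketch, Job and submitAndWait are illustrative names, a single-thread executor stands in for the job queue, and only the CLEAR operation type comes from the text (the other enum constants are placeholders).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the cs-server implementation.
class BlockingJobSketch {
    // Only CLEAR is named in the text; the other constants are placeholders.
    enum OperationType { CREATE, UPDATE, SEARCH, CLEAR }

    // A request wrapped as a job: its operation type plus the work to run.
    static final class Job<T> {
        final OperationType type;
        final Callable<T> work;

        Job(OperationType type, Callable<T> work) {
            this.type = type;
            this.work = work;
        }
    }

    // A single worker thread stands in for the job queue.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();

    // Submits the job to the queue and blocks the caller until its result is ready.
    <T> T submitAndWait(Job<T> job) {
        Future<T> pending = queue.submit(job.work);
        try {
            return pending.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for job", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job failed", e.getCause());
        }
    }

    void shutdown() {
        queue.shutdown();
    }

    public static void main(String[] args) {
        BlockingJobSketch server = new BlockingJobSketch();
        try {
            Job<Integer> job = new Job<>(OperationType.CLEAR, () -> 3);
            int cleared = server.submitAndWait(job);
            System.out.println(job.type + " job cleared " + cleared + " ids");
        } finally {
            server.shutdown();
        }
    }
}
```

    Carrying the operation type on the job lets a dispatcher route CLEAR jobs to the cleanup service without inspecting the request body.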

    4. Data structure

    1 Query ID interface searchContextIDByTime

    ①Interface name GET /api/rest_j/v1/contextservice/searchContextIDByTime

    ②Input parameters

    Parameter name    Parameter description  Request type  Required  Data type  schema
    accessTimeEnd     Access end time        query         false     string
    accessTimeStart   Access start time      query         false     string
    createTimeEnd     Create end time        query         false     string
    createTimeStart   Create start time      query         false     string
    pageNow           Page number            query         false     string
    pageSize          Page size              query         false     string
    updateTimeEnd     Update end time        query         false     string
    updateTimeStart   Update start time      query         false     string

    ③Example of output parameters

    {
        "method": "/api/contextservice/searchContextIDByTime",
        "status": 0,
        "message": "OK",
        "data": {
            "contextIDs": [
                "8-8--cs_1_devcs_2_dev10493",
                "8-8--cs_1_devcs_2_dev10495",
                "8-8--cs_1_devcs_2_dev10496",
                "8-8--cs_1_devcs_2_dev10497",
                "8-8--cs_2_devcs_2_dev10498"
            ]
        }
    }
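
    As a rough illustration of how a caller might assemble the GET request for searchContextIDByTime, the sketch below builds the request URI from the optional query parameters. The class SearchUriSketch, the base URL http://127.0.0.1:9001 and the "yyyy-MM-dd HH:mm:ss" time format for query values are assumptions for illustration; authentication and the actual HTTP call are left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of cs-client: builds the search request URI.
class SearchUriSketch {
    static final String PATH = "/api/rest_j/v1/contextservice/searchContextIDByTime";

    // All query parameters are optional, so null values are simply skipped.
    static URI buildSearchUri(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getValue() == null) {
                continue;
            }
            // URL-encode the value so that spaces and colons in timestamps are safe.
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        String q = query.toString();
        return URI.create(baseUrl + PATH + (q.isEmpty() ? "" : "?" + q));
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("createTimeStart", "2022-06-01 00:00:00");
        params.put("createTimeEnd", "2022-06-30 00:00:00");
        params.put("pageNow", "1");
        params.put("pageSize", "100");
        // The base URL is a placeholder for the gateway address of a real deployment.
        System.out.println(buildSearchUri("http://127.0.0.1:9001", params));
    }
}
```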

    2 Clear specified ID interface clearAllContextByID

    ①Interface name POST /api/rest_j/v1/contextservice/clearAllContextByID

    ②Example of input parameters

    ③Example of output parameters

    {
        "method": "/api/contextservice/clearAllContextByID",
        "status": 0,
        "message": "OK",
        "data": {
            "num": "1"
        }
    }

    3 Clear by time interface clearAllContextByTime

    ①Interface name POST /api/rest_j/v1/contextservice/clearAllContextByTime

    ②Example of input parameters

    {
        "createTimeStart": "2022-06-01 00:00:00",
        "createTimeEnd": "2022-06-30 00:00:00"
    }

    ③Example of output parameters

    5.2 JAVA SDK API

    # import pom
    <dependency>
        <groupId>org.apache.linkis</groupId>
        <artifactId>linkis-cs-client</artifactId>
        <version>1.1.3</version>
    </dependency>
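
    # Code reference is as follows
    // Imports needed: java.util.ArrayList and java.util.List, plus ContextClient and
    // ContextClientFactory from linkis-cs-client.
    // The createTimeStart declaration is missing from the source; this value is a placeholder.
    String createTimeStart = "2022-06-01 00:00:00";
    String createTimeEnd = "2022-06-01 24:00:00";
    ContextClient contextClient = ContextClientFactory.getOrCreateContextClient();

    // Interface 1 searchHAIDByTime
    List<String> idList =
            contextClient.searchHAIDByTime(
                    createTimeStart, createTimeEnd, null, null, null, null, 0, 0);
    for (String id : idList) {
        System.out.println(id);
    }
    System.out.println("Got " + idList.size() + " ids.");

    // Interface 2 batchClearContextByHAID
    // The batch is built inside the if block so that id1 is in scope and an empty result is skipped.
    if (idList.size() > 0) {
        String id1 = idList.get(0);
        System.out.println("will clear context of id : " + id1);
        List<String> tmpList = new ArrayList<>();
        tmpList.add(id1);
        int num = contextClient.batchClearContextByHAID(tmpList);
        System.out.println("Succeed to clear " + num + " ids.");
    }

    // Interface 3 batchClearContextByTime
    // The argument list is cut off in the source; the six time bounds below are an assumption
    // that mirrors searchHAIDByTime without its two paging arguments.
    int num1 =
            contextClient.batchClearContextByTime(
                    createTimeStart, createTimeEnd, null, null, null, null);
    System.out.println("Succeed to clear " + num1 + " ids by time.");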

    6. Non-functional design

    6.1 Security

    The RESTful interfaces require login authentication and can only be operated by an administrator. The administrator user is configured in the properties file.

    6.2 Performance

    • The query ID interface searchContextIDByTime is paginated, so it has no performance impact
    • The interface clearAllContextByTime clears data according to a time range. If the time range is too large, the query may time out, but the task will not fail. The cleanup is a single operation and does not affect other queries

    6.3 Capacity

    This requirement provides time range query and batch cleanup interfaces. Upper-layer applications that use ContextService are expected to call them to actively clean up their data.

    6.4 High Availability

    The interface reuses the high availability of the ContextService microservice itself.