Overall Tasks Storage Structure

The following shows the `t_ds_process_definition` table structure:

The `process_definition_json` field is the core field: it defines the task information of the DAG diagram and is stored in JSON format.

The following table describes the common data structure.

| No. | field | type | description |
|-----|-------|------|-------------|
| 1 | globalParams | Array | global parameters |
| 2 | tasks | Array | task collections in the process [for the structure of each type, please refer to the following sections] |
| 3 | tenantId | int | tenant ID |
| 4 | timeout | int | timeout |

Data example:
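
A minimal illustrative sketch of the JSON stored in `process_definition_json`, assembled from the fields in the table above (the tenant ID, the global parameter, and the single abbreviated shell task are placeholders; real definitions carry the full task objects described in the following sections):

```json
{
    "globalParams":[
        {
            "prop":"startDate",
            "direct":"IN",
            "type":"VARCHAR",
            "value":"2020-01-01"
        }
    ],
    "tasks":[
        {
            "type":"SHELL",
            "id":"tasks-80760",
            "name":"Shell Task"
        }
    ],
    "tenantId":1,
    "timeout":0
}
```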

Detailed Explanation of the Storage Structure of Each Task Type

Shell Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task Id | |
| 2 | type | String | task type | SHELL |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | rawScript | String | Shell script | |
| 6 | localParams | Array | customized local parameters | |
| 7 | resourceList | Array | resource files | |
| 8 | description | String | description | |
| 9 | runFlag | String | execution flag | |
| 10 | conditionResult | Object | condition branch | |
| 11 | successNode | Array | jump to node if success | |
| 12 | failedNode | Array | jump to node if failure | |
| 13 | dependence | Object | task dependency | mutual exclusion with params |
| 14 | maxRetryTimes | String | max retry times | |
| 15 | retryInterval | String | retry interval | |
| 16 | timeout | Object | timeout | |
| 17 | taskInstancePriority | String | task priority | |
| 18 | workerGroup | String | Worker group | |
| 19 | preTasks | Array | preposition tasks | |

Node data example:

  1. "type":"SHELL",
  2. "id":"tasks-80760",
  3. "name":"Shell Task",
  4. "params":{
  5. "resourceList":[
  6. {
  7. "id":3,
  8. "name":"run.sh",
  9. "res":"run.sh"
  10. }
  11. ],
  12. "localParams":[
  13. ],
  14. "rawScript":"echo "This is a shell script""
  15. },
  16. "description":"",
  17. "runFlag":"NORMAL",
  18. "conditionResult":{
  19. "successNode":[
  20. ""
  21. ],
  22. "failedNode":[
  23. ""
  24. ]
  25. },
  26. "dependence":{
  27. },
  28. "maxRetryTimes":"0",
  29. "retryInterval":"1",
  30. "timeout":{
  31. "strategy":"",
  32. "interval":null,
  33. "enable":false
  34. },
  35. "taskInstancePriority":"MEDIUM",
  36. "workerGroup":"default",
  37. "preTasks":[
  38. ]
  39. }

SQL Node

Perform data query and update operations on the specified datasource through SQL.

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task id | |
| 2 | type | String | task type | SQL |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | type | String | database type | |
| 6 | datasource | Int | datasource id | |
| 7 | sql | String | query SQL statement | |
| 8 | udfs | String | udf functions | specify UDF function ids, separated by commas |
| 9 | sqlType | String | SQL node type | 0 for query and 1 for non-query SQL |
| 10 | title | String | mail title | |
| 11 | receivers | String | receivers | |
| 12 | receiversCc | String | CC receivers | |
| 13 | showType | String | display type of mail | options: TABLE or ATTACHMENT |
| 14 | connParams | String | connect parameters | |
| 15 | preStatements | Array | preposition SQL statements | |
| 16 | postStatements | Array | postposition SQL statements | |
| 17 | localParams | Array | customized parameters | |
| 18 | description | String | description | |
| 19 | runFlag | String | execution flag | |
| 20 | conditionResult | Object | condition branch | |
| 21 | successNode | Array | jump to node if success | |
| 22 | failedNode | Array | jump to node if failure | |
| 23 | dependence | Object | task dependency | mutual exclusion with params |
| 24 | maxRetryTimes | String | max retry times | |
| 25 | retryInterval | String | retry interval | |
| 26 | timeout | Object | timeout | |
| 27 | taskInstancePriority | String | task priority | |
| 28 | workerGroup | String | Worker group | |
| 29 | preTasks | Array | preposition tasks | |

Node data example:

```json
{
    "type":"SQL",
    "id":"tasks-95648",
    "name":"SqlTask-Query",
    "params":{
        "type":"MYSQL",
        "datasource":1,
        "sql":"select id , name , age from emp where id = ${id}",
        "udfs":"",
        "sqlType":"0",
        "title":"xxxx@xxx.com",
        "receivers":"xxxx@xxx.com",
        "receiversCc":"",
        "showType":"TABLE",
        "localParams":[
            {
                "prop":"id",
                "direct":"IN",
                "type":"INTEGER",
                "value":"1"
            }
        ],
        "connParams":"",
        "preStatements":[
            "insert into emp ( id,name ) value (1,'Li' )"
        ],
        "postStatements":[]
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

PROCEDURE [stored procedures] Node

SPARK Node

The node data structure is as follows:

Node data example:

```json
{
    "type":"SPARK",
    "id":"tasks-87430",
    "name":"SparkTask",
    "params":{
        "mainClass":"org.apache.spark.examples.SparkPi",
        "mainJar":{
            "id":4
        },
        "deployMode":"cluster",
        "resourceList":[
            {
                "id":3,
                "name":"run.sh",
                "res":"run.sh"
            }
        ],
        "localParams":[],
        "driverCores":1,
        "driverMemory":"512M",
        "numExecutors":2,
        "executorMemory":"2G",
        "executorCores":2,
        "mainArgs":"10",
        "others":"",
        "programType":"SCALA"
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

MR (MapReduce) Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task Id | |
| 2 | type | String | task type | MR |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | mainClass | String | main class | |
| 6 | mainArgs | String | execution arguments | |
| 7 | others | String | other arguments | |
| 8 | mainJar | Object | application jar package | |
| 9 | programType | String | program type | JAVA, PYTHON |
| 10 | localParams | Array | customized local parameters | |
| 11 | resourceList | Array | resource files | |
| 12 | description | String | description | |
| 13 | runFlag | String | execution flag | |
| 14 | conditionResult | Object | condition branch | |
| 15 | successNode | Array | jump to node if success | |
| 16 | failedNode | Array | jump to node if failure | |
| 17 | dependence | Object | task dependency | mutual exclusion with params |
| 18 | maxRetryTimes | String | max retry times | |
| 19 | retryInterval | String | retry interval | |
| 20 | timeout | Object | timeout | |
| 21 | taskInstancePriority | String | task priority | |
| 22 | workerGroup | String | Worker group | |
| 23 | preTasks | Array | preposition tasks | |

Node data example:
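
No example survived in this section, so the following is a minimal illustrative sketch built only from the fields in the table above (the task id, main class, jar id, and argument values are placeholders, not taken from a real definition):

```json
{
    "type":"MR",
    "id":"tasks-00001",
    "name":"MrTask",
    "params":{
        "mainClass":"wordcount",
        "mainJar":{
            "id":1
        },
        "mainArgs":"/tmp/wordcount/input /tmp/wordcount/output",
        "others":"",
        "programType":"JAVA",
        "localParams":[],
        "resourceList":[]
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```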

Python Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task Id | |
| 2 | type | String | task type | PYTHON |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | rawScript | String | Python script | |
| 6 | localParams | Array | customized local parameters | |
| 7 | resourceList | Array | resource files | |
| 8 | description | String | description | |
| 9 | runFlag | String | execution flag | |
| 10 | conditionResult | Object | condition branch | |
| 11 | successNode | Array | jump to node if success | |
| 12 | failedNode | Array | jump to node if failure | |
| 13 | dependence | Object | task dependency | mutual exclusion with params |
| 14 | maxRetryTimes | String | max retry times | |
| 15 | retryInterval | String | retry interval | |
| 16 | timeout | Object | timeout | |
| 17 | taskInstancePriority | String | task priority | |
| 18 | workerGroup | String | Worker group | |
| 19 | preTasks | Array | preposition tasks | |

Node data example:

```json
{
    "type":"PYTHON",
    "id":"tasks-5463",
    "name":"Python Task",
    "params":{
        "resourceList":[
            {
                "id":3,
                "name":"run.sh",
                "res":"run.sh"
            }
        ],
        "localParams":[],
        "rawScript":"print(\"This is a python script\")"
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

Flink Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task Id | |
| 2 | type | String | task type | FLINK |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | mainClass | String | main class | |
| 6 | mainArgs | String | execution arguments | |
| 7 | others | String | other arguments | |
| 8 | mainJar | Object | application jar package | |
| 9 | deployMode | String | deployment mode | local, client, cluster |
| 10 | slot | String | slot count | |
| 11 | taskManager | String | taskManager count | |
| 12 | taskManagerMemory | String | taskManager memory size | |
| 13 | jobManagerMemory | String | jobManager memory size | |
| 14 | programType | String | program type | JAVA, SCALA, PYTHON |
| 15 | localParams | Array | local parameters | |
| 16 | resourceList | Array | resource files | |
| 17 | description | String | description | |
| 18 | runFlag | String | execution flag | |
| 19 | conditionResult | Object | condition branch | |
| 20 | successNode | Array | jump to node if success | |
| 21 | failedNode | Array | jump to node if failure | |
| 22 | dependence | Object | task dependency | mutual exclusion with params |
| 23 | maxRetryTimes | String | max retry times | |
| 24 | retryInterval | String | retry interval | |
| 25 | timeout | Object | timeout | |
| 26 | taskInstancePriority | String | task priority | |
| 27 | workerGroup | String | Worker group | |
| 28 | preTasks | Array | preposition tasks | |

Node data example:

```json
{
    "type":"FLINK",
    "id":"tasks-17135",
    "name":"FlinkTask",
    "params":{
        "mainClass":"com.flink.demo",
        "mainJar":{
            "id":6
        },
        "deployMode":"cluster",
        "resourceList":[
            {
                "id":3,
                "name":"run.sh",
                "res":"run.sh"
            }
        ],
        "localParams":[],
        "slot":1,
        "taskManager":"2",
        "jobManagerMemory":"1G",
        "taskManagerMemory":"2G",
        "executorCores":2,
        "mainArgs":"100",
        "others":"",
        "programType":"SCALA"
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

HTTP Node

Node data example:

```json
{
    "type":"HTTP",
    "id":"tasks-60499",
    "name":"HttpTask",
    "params":{
        "localParams":[],
        "httpParams":[
            {
                "prop":"id",
                "httpParametersType":"PARAMETER",
                "value":"1"
            },
            {
                "prop":"name",
                "value":"Bo"
            }
        ],
        "url":"https://www.xxxxx.com:9012",
        "httpCheckCondition":"STATUS_CODE_DEFAULT",
        "condition":""
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

DataX Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task Id | |
| 2 | type | String | task type | DATAX |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | customConfig | Int | specify whether to use a customized config | 0: not customized, 1: customized |
| 6 | dsType | String | datasource type | |
| 7 | dataSource | Int | datasource ID | |
| 8 | dtType | String | target database type | |
| 9 | dataTarget | Int | target database ID | |
| 10 | sql | String | SQL statements | |
| 11 | targetTable | String | target table | |
| 12 | jobSpeedByte | Int | job speed limit (bytes) | |
| 13 | jobSpeedRecord | Int | job speed limit (records) | |
| 14 | preStatements | Array | preposition SQL | |
| 15 | postStatements | Array | postposition SQL | |
| 16 | json | String | customized configs | valid if customConfig=1 |
| 17 | localParams | Array | customized parameters | valid if customConfig=1 |
| 18 | description | String | description | |
| 19 | runFlag | String | execution flag | |
| 20 | conditionResult | Object | condition branch | |
| 21 | successNode | Array | jump to node if success | |
| 22 | failedNode | Array | jump to node if failure | |
| 23 | dependence | Object | task dependency | mutual exclusion with params |
| 24 | maxRetryTimes | String | max retry times | |
| 25 | retryInterval | String | retry interval | |
| 26 | timeout | Object | timeout | |
| 27 | taskInstancePriority | String | task priority | |
| 28 | workerGroup | String | Worker group | |
| 29 | preTasks | Array | preposition tasks | |

Node data example:
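
No example survived here either; the sketch below is assembled only from the fields in the table above, with customConfig=0 so the `json` and `localParams` fields are omitted (the task id, datasource ids, SQL, and target table are placeholders):

```json
{
    "type":"DATAX",
    "id":"tasks-00002",
    "name":"DataxTask",
    "params":{
        "customConfig":0,
        "dsType":"MYSQL",
        "dataSource":1,
        "dtType":"MYSQL",
        "dataTarget":1,
        "sql":"select id, name from user",
        "targetTable":"user_copy",
        "jobSpeedByte":0,
        "jobSpeedRecord":1000,
        "preStatements":[],
        "postStatements":[]
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```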

Sqoop Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task ID | |
| 2 | type | String | task type | SQOOP |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | concurrency | Int | concurrency rate | |
| 6 | modelType | String | flow direction | import, export |
| 7 | sourceType | String | datasource type | |
| 8 | sourceParams | String | datasource parameters | JSON format |
| 9 | targetType | String | target datasource type | |
| 10 | targetParams | String | target datasource parameters | JSON format |
| 11 | localParams | Array | customized local parameters | |
| 12 | description | String | description | |
| 13 | runFlag | String | execution flag | |
| 14 | conditionResult | Object | condition branch | |
| 15 | successNode | Array | jump to node if success | |
| 16 | failedNode | Array | jump to node if failure | |
| 17 | dependence | Object | task dependency | mutual exclusion with params |
| 18 | maxRetryTimes | String | max retry times | |
| 19 | retryInterval | String | retry interval | |
| 20 | timeout | Object | timeout | |
| 21 | taskInstancePriority | String | task priority | |
| 22 | workerGroup | String | Worker group | |
| 23 | preTasks | Array | preposition tasks | |

Node data example:

```json
{
    "type":"SQOOP",
    "id":"tasks-82041",
    "name":"Sqoop Task",
    "params":{
        "concurrency":1,
        "modelType":"import",
        "sourceType":"MYSQL",
        "targetType":"HDFS",
        "sourceParams":"{\"srcType\":\"MYSQL\",\"srcDatasource\":1,\"srcTable\":\"\",\"srcQueryType\":\"1\",\"srcQuerySql\":\"select id , name from user\",\"srcColumnType\":\"0\",\"srcColumns\":\"\",\"srcConditionList\":[],\"mapColumnHive\":[{\"prop\":\"hivetype-key\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"hivetype-value\"}],\"mapColumnJava\":[{\"prop\":\"javatype-key\",\"direct\":\"IN\",\"type\":\"VARCHAR\",\"value\":\"javatype-value\"}]}",
        "targetParams":"{\"targetPath\":\"/user/hive/warehouse/ods.db/user\",\"deleteTargetDir\":false,\"fileType\":\"--as-avrodatafile\",\"compressionCodec\":\"snappy\",\"fieldsTerminated\":\",\",\"linesTerminated\":\"@\"}",
        "localParams":[]
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

Condition Branch Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task ID | |
| 2 | type | String | task type | CONDITIONS |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | null |
| 5 | description | String | description | |
| 6 | runFlag | String | execution flag | |
| 7 | conditionResult | Object | condition branch | |
| 8 | successNode | Array | jump to node if success | |
| 9 | failedNode | Array | jump to node if failure | |
| 10 | dependence | Object | task dependency | mutual exclusion with params |
| 11 | maxRetryTimes | String | max retry times | |
| 12 | retryInterval | String | retry interval | |
| 13 | timeout | Object | timeout | |
| 14 | taskInstancePriority | String | task priority | |
| 15 | workerGroup | String | Worker group | |
| 16 | preTasks | Array | preposition tasks | |

Node data example:

```json
{
    "type":"CONDITIONS",
    "id":"tasks-96189",
    "name":"条件",
    "params":{},
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[
            "test04"
        ],
        "failedNode":[
            "test05"
        ]
    },
    "dependence":{
        "relation":"AND",
        "dependTaskList":[]
    },
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[
        "test01",
        "test02"
    ]
}
```

Subprocess Node

The node data structure is as follows:

Node data example:

```json
{
    "type":"SUB_PROCESS",
    "id":"tasks-14806",
    "name":"SubProcessTask",
    "params":{
        "processDefinitionId":2
    },
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{},
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```

DEPENDENT Node

The node data structure is as follows:

| No. | parameter name | type | description | notes |
|-----|----------------|------|-------------|-------|
| 1 | id | String | task ID | |
| 2 | type | String | task type | DEPENDENT |
| 3 | name | String | task name | |
| 4 | params | Object | customized parameters | JSON format |
| 5 | rawScript | String | Shell script | |
| 6 | localParams | Array | customized local parameters | |
| 7 | resourceList | Array | resource files | |
| 8 | description | String | description | |
| 9 | runFlag | String | execution flag | |
| 10 | conditionResult | Object | condition branch | |
| 11 | successNode | Array | jump to node if success | |
| 12 | failedNode | Array | jump to node if failure | |
| 13 | dependence | Object | task dependency | mutual exclusion with params |
| 14 | relation | String | relation | AND, OR |
| 15 | dependTaskList | Array | dependent task list | |
| 16 | maxRetryTimes | String | max retry times | |
| 17 | retryInterval | String | retry interval | |
| 18 | timeout | Object | timeout | |
| 19 | taskInstancePriority | String | task priority | |
| 20 | workerGroup | String | Worker group | |
| 21 | preTasks | Array | preposition tasks | |
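
Node data example (a minimal illustrative sketch built only from the fields in the table above; the task id and name are placeholders, and a real definition would carry concrete entries in `dependTaskList` describing the depended-on processes and tasks):

```json
{
    "type":"DEPENDENT",
    "id":"tasks-00003",
    "name":"DependentTask",
    "params":{},
    "description":"",
    "runFlag":"NORMAL",
    "conditionResult":{
        "successNode":[""],
        "failedNode":[""]
    },
    "dependence":{
        "relation":"AND",
        "dependTaskList":[]
    },
    "maxRetryTimes":"0",
    "retryInterval":"1",
    "timeout":{
        "strategy":"",
        "interval":null,
        "enable":false
    },
    "taskInstancePriority":"MEDIUM",
    "workerGroup":"default",
    "preTasks":[]
}
```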