ML Commons API



    In order to train tasks through the API, three inputs are required.

    • Algorithm name: Must be one of the FunctionName values. This determines which algorithm the ML engine runs. To add a new function, see .
    • Model hyperparameters: Adjust these parameters to improve model training.
    • Input data: The data used to train the ML model, or the data the model is applied to for predictions. You can provide input data in two ways: query your index or use a data frame.

    Training can occur both synchronously and asynchronously.

    The following examples use the kmeans algorithm to train index data.

    Train with kmeans synchronously

    Train with kmeans asynchronously

    {
      "parameters": {
        "centroids": 3,
        "iterations": 10,
        "distance_type": "COSINE"
      },
      "input_query": {
        "_source": ["petal_length_in_cm", "petal_width_in_cm"],
        "size": 10000
      },
      "input_index": [
        "iris_data"
      ]
    }
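The request line for the two training examples is not shown above. The following Python sketch assembles a request for either mode without sending it. The path `/_plugins/_ml/_train/<algorithm_name>` and the `async=true` query parameter are assumptions made by analogy with the other ML Commons paths in this document, and `build_train_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode


def build_train_request(algorithm, parameters, input_index, input_query, run_async=False):
    """Return (method, path, body) for a train call. Nothing is sent."""
    # Assumed path: not shown in the text, modeled on /_plugins/_ml/_train_predict/<algorithm>.
    path = f"/_plugins/_ml/_train/{algorithm}"
    if run_async:
        # Assumed query parameter selecting asynchronous training.
        path += "?" + urlencode({"async": "true"})
    body = {
        "parameters": parameters,
        "input_query": input_query,
        "input_index": list(input_index),
    }
    return "POST", path, json.dumps(body)


method, path, body = build_train_request(
    "kmeans",
    {"centroids": 3, "iterations": 10, "distance_type": "COSINE"},
    ["iris_data"],
    {"_source": ["petal_length_in_cm", "petal_width_in_cm"], "size": 10000},
    run_async=True,
)
```

Returning the pieces instead of sending them keeps the sketch independent of any particular HTTP client.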

    Response

    Synchronously

    For synchronous responses, the API returns the model_id, which can be used to get or delete a model.

    {
      "model_id" : "lblVmX8BO5w8y8RaYYvN",
      "status" : "COMPLETED"
    }

    Asynchronously

    For asynchronous responses, the API returns the task_id, which can be used to get or delete a task.

    {
      "task_id" : "lrlamX8BO5w8y8Ra2otd",
      "status" : "CREATED"
    }
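Because the two training modes return different identifiers, client code usually needs to branch on which one is present. A minimal Python sketch of that branching follows. It relies only on the `model_id`, `task_id`, and `status` fields shown above; `resource_from_train_response` is a hypothetical helper name.

```python
def resource_from_train_response(response):
    """Classify a train response as a model reference or a task reference."""
    if "model_id" in response:
        # Synchronous training: the model already exists.
        return ("model", response["model_id"], response.get("status"))
    if "task_id" in response:
        # Asynchronous training: follow up through the task.
        return ("task", response["task_id"], response.get("status"))
    raise ValueError("response has neither model_id nor task_id")


sync_result = resource_from_train_response(
    {"model_id": "lblVmX8BO5w8y8RaYYvN", "status": "COMPLETED"}
)
async_result = resource_from_train_response(
    {"task_id": "lrlamX8BO5w8y8Ra2otd", "status": "CREATED"}
)
```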

    Get model information

    GET /_plugins/_ml/models/<model_id>

    The API returns information on the model, the algorithm used, and the content found within the model.

    {
      "name" : "KMEANS",
      "algorithm" : "KMEANS",
      "version" : 1,
      "content" : ""
    }

    Search model

    Use this command to search models you’ve already created.

    POST /_plugins/_ml/models/_search
    {query}

    Example: Query all models

    POST /_plugins/_ml/models/_search
    {
      "query": {
        "match_all": {}
      },
      "size": 1000
    }

    Example: Query models with algorithm “FIT_RCF”

    POST /_plugins/_ml/models/_search
    {
      "query": {
        "term": {
          "algorithm": {
            "value": "FIT_RCF"
          }
        }
      }
    }
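The two search bodies above differ only in the query clause, so a small builder can produce either. This Python sketch reproduces both example bodies; `model_search_body` is a hypothetical helper, not part of ML Commons.

```python
def model_search_body(algorithm=None, size=None):
    """Build a model search body: match_all, or a term query on "algorithm"."""
    if algorithm is None:
        query = {"match_all": {}}
    else:
        query = {"term": {"algorithm": {"value": algorithm}}}
    body = {"query": query}
    if size is not None:
        body["size"] = size
    return body


all_models = model_search_body(size=1000)
rcf_models = model_search_body("FIT_RCF")
```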

    Response

    {
      "took" : 8,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 2,
          "relation" : "eq"
        },
        "max_score" : 2.4159138,
        "hits" : [
          {
            "_index" : ".plugins-ml-model",
            "_id" : "-QkKJX8BvytMh9aUeuLD",
            "_version" : 1,
            "_seq_no" : 12,
            "_primary_term" : 15,
            "_score" : 2.4159138,
            "_source" : {
              "name" : "FIT_RCF",
              "version" : 1,
              "content" : "xxx",
              "algorithm" : "FIT_RCF"
            }
          },
          {
            "_index" : ".plugins-ml-model",
            "_id" : "OxkvHn8BNJ65KnIpck8x",
            "_version" : 1,
            "_seq_no" : 2,
            "_primary_term" : 8,
            "_score" : 2.4159138,
            "_source" : {
              "name" : "FIT_RCF",
              "version" : 1,
              "content" : "xxx",
              "algorithm" : "FIT_RCF"
            }
          }
        ]
      }
    }
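In a search response, the model ID is the `_id` of each hit, and the model fields are under `_source`. The following Python sketch pulls those out of a response shaped like the one above. `summarize_models` is a hypothetical helper, and the sample is abbreviated to the fields it reads.

```python
def summarize_models(search_response):
    """Return one small dict per hit: model ID, algorithm, and version."""
    return [
        {
            "model_id": hit["_id"],
            "algorithm": hit["_source"]["algorithm"],
            "version": hit["_source"]["version"],
        }
        for hit in search_response["hits"]["hits"]
    ]


sample = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_id": "-QkKJX8BvytMh9aUeuLD",
             "_source": {"name": "FIT_RCF", "version": 1, "algorithm": "FIT_RCF"}},
            {"_id": "OxkvHn8BNJ65KnIpck8x",
             "_source": {"name": "FIT_RCF", "version": 1, "algorithm": "FIT_RCF"}},
        ],
    }
}
models = summarize_models(sample)
```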

    Deletes a model based on the model_id

    DELETE /_plugins/_ml/models/<model_id>

    The API returns the following:

    Predict

    ML Commons can predict new data with your trained model either from indexed data or a data frame. To use the Predict API, the model_id is required.

    POST /_plugins/_ml/_predict/<algorithm_name>/<model_id>

    POST /_plugins/_ml/_predict/kmeans/<model_id>
    {
      "input_query": {
        "_source": ["petal_length_in_cm", "petal_width_in_cm"],
        "size": 10000
      },
      "input_index": [
        "iris_data"
      ]
    }

    Response

    {
      "status" : "COMPLETED",
      "prediction_result" : {
        "column_metas" : [
          {
            "name" : "ClusterID",
            "column_type" : "INTEGER"
          }
        ],
        "rows" : [
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 1
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 1
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          }
        ]
      }
    }
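The prediction result is a data frame: `column_metas` names the columns, and each row holds one typed value per column. The following Python sketch flattens that structure into plain records and counts cluster assignments. `rows_to_records` is a hypothetical helper, and the sample mirrors the six cluster IDs in the response above.

```python
from collections import Counter


def rows_to_records(prediction_result):
    """Turn a column_metas/rows data frame into a list of {column: value} dicts."""
    names = [meta["name"] for meta in prediction_result["column_metas"]]
    return [
        {name: cell["value"] for name, cell in zip(names, row["values"])}
        for row in prediction_result["rows"]
    ]


prediction_result = {
    "column_metas": [{"name": "ClusterID", "column_type": "INTEGER"}],
    "rows": [
        {"values": [{"column_type": "INTEGER", "value": v}]}
        for v in (1, 1, 0, 0, 0, 0)
    ],
}
records = rows_to_records(prediction_result)
cluster_sizes = Counter(record["ClusterID"] for record in records)
```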

    Train and predict

    Use this API to train a model and then immediately predict against the same training data set. It can only be used with unsupervised learning models and the following algorithms:

    • BATCH_RCF
    • FIT_RCF
    • kmeans

    Example: Train and predict with indexed data

    POST /_plugins/_ml/_train_predict/kmeans
    {
      "parameters": {
        "centroids": 2,
        "iterations": 10,
        "distance_type": "COSINE"
      },
      "input_query": {
        "query": {
          "bool": {
            "filter": [
              {
                "range": {
                  "k1": {
                    "gte": 0
                  }
                }
              }
            ]
          }
        },
        "size": 10
      },
      "input_index": [
        "test_data"
      ]
    }

    Example: Train and predict with data directly

    POST /_plugins/_ml/_train_predict/kmeans
    {
      "parameters": {
        "centroids": 2,
        "iterations": 1,
        "distance_type": "EUCLIDEAN"
      },
      "input_data": {
        "column_metas": [
          {
            "name": "k1",
            "column_type": "DOUBLE"
          },
          {
            "name": "k2",
            "column_type": "DOUBLE"
          }
        ],
        "rows": [
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 1.00
              },
              {
                "column_type": "DOUBLE",
                "value": 2.00
              }
            ]
          },
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 1.00
              },
              {
                "column_type": "DOUBLE",
                "value": 4.00
              }
            ]
          },
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 1.00
              },
              {
                "column_type": "DOUBLE",
                "value": 0.00
              }
            ]
          },
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 10.00
              },
              {
                "column_type": "DOUBLE",
                "value": 2.00
              }
            ]
          },
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 10.00
              },
              {
                "column_type": "DOUBLE",
                "value": 4.00
              }
            ]
          },
          {
            "values": [
              {
                "column_type": "DOUBLE",
                "value": 10.00
              },
              {
                "column_type": "DOUBLE",
                "value": 0.00
              }
            ]
          }
        ]
      }
    }
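Writing the `input_data` data frame by hand is verbose, because every value repeats its column type. As a sketch, the following Python function builds the same structure from a column list and plain tuples. `to_data_frame` is a hypothetical helper; the column names and values are the ones from the example above.

```python
def to_data_frame(columns, records):
    """Build an input_data data frame.

    columns: list of (name, column_type) pairs.
    records: list of tuples, one value per column.
    """
    column_metas = [{"name": name, "column_type": ctype} for name, ctype in columns]
    rows = []
    for record in records:
        if len(record) != len(columns):
            raise ValueError("each record needs one value per column")
        rows.append({
            "values": [
                {"column_type": ctype, "value": value}
                for (_, ctype), value in zip(columns, record)
            ]
        })
    return {"column_metas": column_metas, "rows": rows}


points = [(1.0, 2.0), (1.0, 4.0), (1.0, 0.0), (10.0, 2.0), (10.0, 4.0), (10.0, 0.0)]
request_body = {
    "parameters": {"centroids": 2, "iterations": 1, "distance_type": "EUCLIDEAN"},
    "input_data": to_data_frame([("k1", "DOUBLE"), ("k2", "DOUBLE")], points),
}
```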

    Response

    {
      "status" : "COMPLETED",
      "prediction_result" : {
        "column_metas" : [
          {
            "name" : "ClusterID",
            "column_type" : "INTEGER"
          }
        ],
        "rows" : [
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 1
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 1
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 1
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          },
          {
            "values" : [
              {
                "column_type" : "INTEGER",
                "value" : 0
              }
            ]
          }
        ]
      }
    }

    You can retrieve information about a task using the task_id.

    GET /_plugins/_ml/tasks/<task_id>

    The response includes information about the task.

    {
      "model_id" : "l7lamX8BO5w8y8Ra2oty",
      "task_type" : "TRAINING",
      "function_name" : "KMEANS",
      "state" : "COMPLETED",
      "input_type" : "SEARCH_QUERY",
      "worker_node" : "54xOe0w8Qjyze00UuLDfdA",
      "create_time" : 1647545342556,
      "last_update_time" : 1647545342587,
      "is_async" : true
    }
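The `create_time` and `last_update_time` values look like Unix epoch timestamps in milliseconds; that reading is an assumption, since the text does not state the unit. Under that assumption, this Python sketch converts them to UTC datetimes and computes the time between creation and the last update. `task_timing` is a hypothetical helper.

```python
from datetime import datetime, timezone


def task_timing(task):
    """Return (created, updated, elapsed_ms), assuming epoch-millisecond fields."""
    created = datetime.fromtimestamp(task["create_time"] / 1000, tz=timezone.utc)
    updated = datetime.fromtimestamp(task["last_update_time"] / 1000, tz=timezone.utc)
    elapsed_ms = task["last_update_time"] - task["create_time"]
    return created, updated, elapsed_ms


task = {
    "state": "COMPLETED",
    "create_time": 1647545342556,
    "last_update_time": 1647545342587,
}
created, updated, elapsed_ms = task_timing(task)
```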

    Search task

    Search tasks based on parameters indicated in the request body.

    GET /_plugins/_ml/tasks/_search
    {query body}

    GET /_plugins/_ml/tasks/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "function_name": "KMEANS"
              }
            }
          ]
        }
      }
    }

    Response

    Delete task

    ML Commons does not check the task status when running the Delete request. There is a risk that a currently running task could be deleted before the task completes. To check the status of a task, run GET /_plugins/_ml/tasks/<task_id> before task deletion.

    DELETE /_plugins/_ml/tasks/<task_id>

    The API returns the following:

    {
      "_index" : ".plugins-ml-task",
      "_id" : "xQRYLX8BydmmU1x6nuD3",
      "_version" : 4,
      "result" : "deleted",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "_seq_no" : 42,
      "_primary_term" : 7
    }
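Because the Delete request does not check task status, a client can apply the check itself before deleting. The following Python sketch shows one way to do so. The `get_task` and `delete_task` callables are hypothetical stand-ins for whatever HTTP client issues the GET and DELETE requests, and the set of finished states is an assumption: only `COMPLETED` appears in this document, and `FAILED` is a guess.

```python
def delete_task_if_finished(task_id, get_task, delete_task,
                            finished_states=("COMPLETED", "FAILED")):
    """Delete a task only if its state is finished. Return True if deleted.

    get_task(task_id) should return the parsed GET response body;
    delete_task(task_id) should issue the DELETE request.
    The default finished_states are an assumption, not a documented list.
    """
    task = get_task(task_id)
    if task.get("state") not in finished_states:
        return False  # Still in progress (or unknown state): leave it alone.
    delete_task(task_id)
    return True


# Usage with in-memory stand-ins instead of real HTTP calls.
fake_tasks = {"done": {"state": "COMPLETED"}, "busy": {"state": "RUNNING"}}
deleted = []
first = delete_task_if_finished("done", fake_tasks.__getitem__, deleted.append)
second = delete_task_if_finished("busy", fake_tasks.__getitem__, deleted.append)
```

The check and the delete are two separate requests, so this narrows the risk but does not remove it.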

    Get statistics related to the number of tasks.

    To receive all stats, use:

    GET /_plugins/_ml/stats

    To receive stats for a specific node, use:

    GET /_plugins/_ml/<nodeId>/stats/

    To receive stats for a specific node and return a specified stat, use:

    GET /_plugins/_ml/<nodeId>/stats/<stat>

    To receive information on a specific stat from all nodes, use:

    GET /_plugins/_ml/stats/<stat>
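The four stats paths follow one pattern: an optional node ID before `stats` and an optional stat name after it. As a sketch, this Python function builds all four forms. `stats_path` is a hypothetical helper, and it omits the trailing slash shown in the node-only form above.

```python
def stats_path(node_id=None, stat=None):
    """Build a stats path with an optional node ID and an optional stat name."""
    parts = ["/_plugins/_ml"]
    if node_id:
        parts.append(node_id)
    parts.append("stats")
    if stat:
        parts.append(stat)
    return "/".join(parts)


all_stats = stats_path()
node_stats = stats_path(node_id="54xOe0w8Qjyze00UuLDfdA")
one_stat = stats_path(stat="ml_executing_task_count")
node_one_stat = stats_path("54xOe0w8Qjyze00UuLDfdA", "ml_executing_task_count")
```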

    Example: Get all stats

    GET /_plugins/_ml/stats

    Response

    {
      "zbduvgCCSOeu6cfbQhTpnQ" : {
        "ml_executing_task_count" : 0
      },
      "54xOe0w8Qjyze00UuLDfdA" : {
        "ml_executing_task_count" : 0
      },
      "UJiykI7bTKiCpR-rqLYHyw" : {
        "ml_executing_task_count" : 0
      },
      "zj2_NgIbTP-StNlGZJlxdg" : {
        "ml_executing_task_count" : 0
      },
      "jjqFrlW7QWmni1tRnb_7Dg" : {
        "ml_executing_task_count" : 0
      },
      "3pSSjl5PSVqzv5-hBdFqyA" : {
        "ml_executing_task_count" : 0
      },
      "A_IiqoloTDK01uZvCjREaA" : {
        "ml_executing_task_count" : 0
      }
    }
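The stats response is keyed by node ID, so a cluster-wide figure has to be aggregated on the client. A minimal Python sketch, assuming the response shape shown above, follows. `total_executing_tasks` and `busy_nodes` are hypothetical helper names, and the nonzero count in the sample is made up for illustration.

```python
def total_executing_tasks(stats):
    """Sum ml_executing_task_count over every node in a stats response."""
    return sum(node.get("ml_executing_task_count", 0) for node in stats.values())


def busy_nodes(stats):
    """Return the IDs of nodes that are currently executing at least one task."""
    return sorted(
        node_id for node_id, node in stats.items()
        if node.get("ml_executing_task_count", 0) > 0
    )


sample_stats = {
    "zbduvgCCSOeu6cfbQhTpnQ": {"ml_executing_task_count": 0},
    "54xOe0w8Qjyze00UuLDfdA": {"ml_executing_task_count": 2},  # illustrative value
    "UJiykI7bTKiCpR-rqLYHyw": {"ml_executing_task_count": 0},
}
total = total_executing_tasks(sample_stats)
busy = busy_nodes(sample_stats)
```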

    Execute

    Some algorithms, such as Localization, don’t require trained models. You can run these model-free algorithms using the Execute API.

    POST /_plugins/_ml/_execute/<algorithm_name>

    Example: Execute localization

    The following example uses the Localization algorithm to find subset-level information for aggregate data (for example, aggregated over time) that demonstrates the activity of interest, such as spikes, drops, changes, or anomalies.

    POST /_plugins/_ml/_execute/anomaly_localization
    {
      "index_name": "rca-index",
      "attribute_field_names": [
        "attribute"
      ],
      "aggregations": [
        {
          "sum": {
            "sum": {
              "field": "value"
            }
          }
        }
      ],
      "time_field_name": "timestamp",
      "start_time": 1620630000000,
      "end_time": 1621234800000,
      "min_time_interval": 86400000,