Web Interface

    The main page lists queries along with information such as the unique query ID, query text, query state, percentage completed, username, and the source from which the query originated. Currently running queries appear at the top of the page, followed by the most recently completed or failed queries.

    The possible query states are as follows:

    • QUEUED – Query has been accepted and is awaiting execution.

    • RUNNING – Query has at least one running task.

    • BLOCKED – Query is blocked and is waiting for resources (buffer space, memory, splits, etc.).

    • FINISHED – Query has finished executing and all output has been consumed.

    • FAILED – Query execution failed.
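
    The states and the main-page ordering described above can be modeled roughly as follows. This is a minimal sketch, not the interface's actual implementation: the state names QUEUED, RUNNING, and FINISHED, the QueryRow type, the last_update field, and the rule that every non-terminal query counts as "currently running" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class QueryState(Enum):
    QUEUED = "QUEUED"      # accepted, awaiting execution
    RUNNING = "RUNNING"    # at least one running task
    BLOCKED = "BLOCKED"    # waiting for resources
    FINISHED = "FINISHED"  # executed, all output consumed
    FAILED = "FAILED"      # execution failed


# Assumption for this sketch: these two states mean the query is done.
DONE_STATES = {QueryState.FINISHED, QueryState.FAILED}


@dataclass
class QueryRow:
    query_id: str
    state: QueryState
    last_update: float  # hypothetical timestamp, seconds since epoch


def main_page_order(rows):
    """Return active queries first, then done queries, newest first."""
    active = [r for r in rows if r.state not in DONE_STATES]
    done = [r for r in rows if r.state in DONE_STATES]
    done.sort(key=lambda r: r.last_update, reverse=True)
    return active + done
```

    Splitting the rows into two groups keeps the active queries in their original order while sorting only the completed and failed ones by recency.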

    The BLOCKED state is normal, but if it persists, it should be investigated. It has many potential causes: insufficient memory or splits, disk or network I/O bottlenecks, data skew (all the data goes to a few workers), a lack of parallelism (only a few workers available), or computationally expensive stages of the query following a given stage. Additionally, a query can be in the BLOCKED state if a client is not processing the data fast enough (common with “SELECT *” queries).
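
    One way to tell a normal, short-lived BLOCKED state from a persistent one is to sample a query's state periodically and measure the longest uninterrupted run of BLOCKED samples. The sketch below assumes the samples have already been collected as (timestamp, state name) pairs; the function names and the 60-second threshold are hypothetical choices for illustration, not values taken from the interface.

```python
def longest_blocked_streak(samples):
    """Return the longest continuous time spent in BLOCKED, in seconds.

    samples: list of (timestamp_seconds, state_name), sorted by time.
    """
    longest = 0.0
    streak_start = None
    for ts, state in samples:
        if state == "BLOCKED":
            if streak_start is None:
                streak_start = ts  # a new BLOCKED streak begins here
            longest = max(longest, ts - streak_start)
        else:
            streak_start = None  # any other state ends the streak
    return longest


def is_persistently_blocked(samples, threshold_seconds=60.0):
    """Flag a query whose longest BLOCKED streak reaches the threshold.

    The default threshold is an arbitrary assumption; tune it per workload.
    """
    return longest_blocked_streak(samples) >= threshold_seconds
```

    A query that alternates between RUNNING and BLOCKED produces only short streaks and is not flagged, which matches the observation that brief blocking is normal.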

    The summary section has a button to kill the currently running query. There are two visualizations available in the summary section: task execution and timeline. The full JSON document containing information and statistics about the query is available by clicking the JSON link. These visualizations and other statistics can be used to analyze where time is being spent for a query.