drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-4909) Refinements to Drill web UI - Query profile page
Date Wed, 28 Sep 2016 19:00:23 GMT

     [ https://issues.apache.org/jira/browse/DRILL-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-4909:
-------------------------------
    Description: 
The plan I'm looking at has hundreds of nodes. It takes a long time to scroll around the pages
to get to the top. Fix the tab bar (the one with Physical Plan, Visualization, etc.) in place
at the top of the page to simplify navigation.

On the Physical Plan page: the top of the page displays a histogram of minor fragment execution.
However, it is hard to tell what it shows.

* Label the x-axis. The units seem to be seconds, but a label such as "Runtime (sec.)" would
help.
* Label the y-axis. It seems to be colored by major fragment, with one line per minor fragment,
but it took some sleuthing to figure this out.
* Add a tooltip on each color band to identify the major fragment. (It is probably too fiddly
to label minor fragment lines.)
* Choose a wider palette of colors. On my chart, the top two groups are shades of orange and
the third is blue. It seems we could rotate among the standard set of colors for better contrast.
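
The palette rotation suggested above could be sketched as follows. This is only an
illustration, not Drill's code: the function names are hypothetical, and the palette is
assumed to be the ten-color categorical set commonly used by D3 and Matplotlib. Any
high-contrast categorical palette would serve.

```python
# Assumed palette: the common ten-color categorical set (D3 "category10").
PALETTE = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def color_for_major_fragment(major_id: int) -> str:
    """Rotate through the palette so neighboring major fragments get distinct colors."""
    return PALETTE[major_id % len(PALETTE)]


def fragment_label(major_id: int) -> str:
    """Build a tooltip label in the 00-xx-xx style used on the profile page."""
    return f"{major_id:02d}-xx-xx"
```

Keying the color on the major fragment number alone means any page that uses the same
function would color a given fragment the same way.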

In the tables:

* For each operator, list the number of rows processed. (Available in the details already.)
* In the table that summarizes major fragments, add a tooltip with the names of the minor fragments
to give the numbers some meaning. That is, hovering over 00-xx-xx should say "Project, Merging
Receiver".
* In the table that shows minor fragments for major fragments, add a list of minor fragment
names either to the title or as a pop-up. That is, in the heading that says "Major Fragment:
02-xx-xx", add "(PARQUET_ROW_GROUP_SCAN, PROJECT, ...)".
* For each minor fragment, label the host on which it runs.
* For larger queries, show groups by host, each expandable to show its fragments. (It is hard
to read, say, 400 minor fragments in one big table. Showing 20 nodes (with summaries) is easier,
each expanding to show 20 minor fragments.)
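
The group-by-host view in the last item could be computed roughly as below. This is a
sketch under assumptions: the record fields ("id", "host", "records", "runtime_sec") are
hypothetical and do not reflect the actual profile schema.

```python
from collections import defaultdict


def group_by_host(minor_fragments):
    """Collapse a flat list of minor fragments into one summary row per host.

    Each fragment is a dict with hypothetical keys: id, host, records, runtime_sec.
    """
    groups = defaultdict(list)
    for frag in minor_fragments:
        groups[frag["host"]].append(frag)

    summaries = []
    for host, frags in sorted(groups.items()):
        summaries.append({
            "host": host,
            "fragment_count": len(frags),
            "total_records": sum(f["records"] for f in frags),
            "max_runtime_sec": max(f["runtime_sec"] for f in frags),
            # Detail rows, shown only when the host row is expanded.
            "fragments": sorted(frags, key=lambda f: f["id"]),
        })
    return summaries
```

With 400 minor fragments spread over 20 hosts, this yields 20 summary rows, each holding
the 20 detail rows to display on expansion.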

In the Operator Profiles overview, add a tooltip with details about each operator, such as:

* Number of vector allocations
* Number of vector extensions (increasing the size of vectors)
* Average vector utilization (ratio of selected to unselected rows)
* Average batch size: number of rows, bytes per row, bytes per batch

For scanners:

* Number of files scanned
* Number of schemas found
* Number of bytes read (or file length if a table scan)
* Name of the file scanned (or first several if a group)

For filters:

* Rows in, rows out and selectivity (as a ratio)
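
As a minimal sketch of the selectivity figure, assuming it is simply rows out divided by
rows in (the function name is hypothetical):

```python
def filter_selectivity(rows_in: int, rows_out: int):
    """Return rows_out / rows_in, or None when the filter saw no input rows."""
    if rows_in == 0:
        return None  # undefined; the UI could show "n/a" instead of a ratio
    return rows_out / rows_in
```

For example, a filter that receives 1000 rows and emits 250 has a selectivity of 0.25.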

In the operator detail table:

* Add a line for totals (records, batches)
* Add a line for averages (most fields)
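
The two summary lines could be derived from the detail rows along these lines. This is a
sketch only: the column names are hypothetical, and it assumes every row carries the same
columns.

```python
def summary_rows(rows, total_fields=("records", "batches")):
    """Return (totals, averages) for a list of per-fragment detail rows.

    Totals are computed only for the count-like fields; averages are computed
    for every numeric column found in the first row.
    """
    if not rows:
        return {}, {}
    numeric = [k for k, v in rows[0].items() if isinstance(v, (int, float))]
    totals = {k: sum(r[k] for r in rows) for k in total_fields}
    averages = {k: sum(r[k] for r in rows) / len(rows) for k in numeric}
    return totals, averages
```

Totals are limited to records and batches because summing time columns across fragments
that run in parallel would be misleading.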

Under the "Full JSON Profile", part of the JSON is formatted, but the plan part is not. Display
the plan in a formatted version (with proper indentation). It is not very useful in the current,
streamed, non-indented form.
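
One way the plan part could be indented along with the rest is sketched below. It assumes
the plan is embedded in the profile as a JSON-encoded string; the key name "plan_json" is
hypothetical and is not taken from the actual profile format.

```python
import json


def format_profile(profile_text: str, plan_key: str = "plan_json") -> str:
    """Pretty-print a profile, expanding an embedded JSON plan string if present."""
    profile = json.loads(profile_text)
    plan = profile.get(plan_key)
    if isinstance(plan, str):
        try:
            # Replace the one-line string with the parsed structure so that
            # json.dumps indents it together with the rest of the profile.
            profile[plan_key] = json.loads(plan)
        except json.JSONDecodeError:
            pass  # not JSON after all; leave the plan text untouched
    return json.dumps(profile, indent=2)
```

If the plan is not valid JSON, the text is left as-is, so the display would be no worse
than the current one.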

Better, move the JSON Profile to a new tab since it causes the Physical Plan page to get too
large for large queries.

On the Visualized Plan page:

* The coloring of the fragments does not match the coloring used in the chart on the Physical
Plan page. Please use the same colors to make them easier to correlate.
* Perhaps enclose each fragment in a box (with the border passing through the middle of each
exchange operator).
* For each aspect of the plan, provide basic stats such as number of minor fragments, number
of records, average time.
* For each node, provide a link to the Physical Plan page to see more detail.
* The Visualized Plan page shows the same info as the Physical Plan page. Better, keep each
page focused and make it easy to navigate between them.
* Naming is inconsistent between this page and the Physical Plan page: "HASH_AGGREGATE" on
the Physical Plan page, "HashAgg" on the visualization page.

On the Edit Query page, the actual field to edit the query is two lines long (but my query
is over a dozen lines). Then, the rest of the page repeats the chart, details, etc. It seems
that, if I want to edit the query, I should have a field large enough to do so. And I don't
need the other info (I'll go to the corresponding tab if I want to see it).


> Refinements to Drill web UI - Query profile page
> ------------------------------------------------
>
>                 Key: DRILL-4909
>                 URL: https://issues.apache.org/jira/browse/DRILL-4909
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Web Server
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
