impala-reviews mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5870: Improve explain/profile output for partial sort
Date Thu, 21 Sep 2017 21:42:17 GMT
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8123 )


Change subject: IMPALA-5870: Improve explain/profile output for partial sort
......................................................................

IMPALA-5870: Improve explain/profile output for partial sort

A recent change (IMPALA-5498) added the ability to do partial sorts,
which divide their input into runs, each of which is sorted
individually, avoiding the need to spill. Some of the EXPLAIN and
runtime profile output was not updated to distinguish partial sorts
from regular sorts, leading to confusion.

For EXPLAIN, this patch removes the 'spill-buffer' mem-estimate for
partial sorts, since they can't spill. It does this by setting the
spillable buffer size in the resource profile to -1. Since the BE
relied on that number to determine the page size for sorts, it
now calculates the page size from the min reservation, which gives
an equivalent value.
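
The sentinel convention described above can be sketched as follows. This
is a minimal Python illustration, not Impala's planner code: the
`ResourceProfile` class, its field names, and the exact label layout are
assumptions made for the example. Only the use of -1 to mean "no
spillable buffer" and the 'spill-buffer' label come from this change
description.

```python
from dataclasses import dataclass

# Sentinel taken from the change description: -1 means the node cannot spill.
NO_SPILL = -1


@dataclass
class ResourceProfile:
    """Hypothetical stand-in for a planner resource profile (sizes in bytes)."""
    mem_estimate: int
    min_reservation: int
    spillable_buffer_size: int = NO_SPILL


def explain_string(profile):
    """Build a resource line, omitting spill-buffer when the node can't spill."""
    parts = [
        f"mem-estimate={profile.mem_estimate}B",
        f"mem-reservation={profile.min_reservation}B",
    ]
    if profile.spillable_buffer_size != NO_SPILL:
        parts.append(f"spill-buffer={profile.spillable_buffer_size}B")
    return " ".join(parts)
```

With this shape, a partial sort's profile leaves `spillable_buffer_size`
at the sentinel and the 'spill-buffer' entry simply does not appear.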

For the runtime profile, it removes the counters 'SpilledRuns' and
'MergesPerformed', since they would always be 0 for a partial sort,
and it renames the 'InitialRunsCreated' counter to 'RunsCreated', since
'Initial' only makes sense for a regular sort, where those runs may
later be spilled or merged.

It also adds a profile info string 'SortType' that can take the values
'Total', 'TopN', or 'Partial' to reflect the type of exec node being
used.
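
As a rough summary of the counter changes, the mapping below lists the
run-related counters one would expect to see per 'SortType'. It is an
illustrative Python sketch, not Impala source: the dict and function
names are hypothetical, and it assumes a regular ('Total') sort keeps the
pre-change counter names. 'TopN' is left out because this description
does not list its counters.

```python
# Run-related counters per SortType, as read from the change description.
# The dict and its name are illustrative only.
RUN_COUNTERS = {
    "Total": ["InitialRunsCreated", "SpilledRuns", "MergesPerformed"],
    "Partial": ["RunsCreated"],
}


def run_counters(sort_type):
    """Return the run-related counter names expected for a sort type."""
    return list(RUN_COUNTERS[sort_type])
```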

Example profile snippet for a partial sort:
SORT_NODE (id=2):(Total: 403.261us, non-child: 382.029us, % non-child: 94.73%)
 SortType: Partial
 ExecOption: Codegen Enabled
    - NumRowsPerRun: (Avg: 44 (44) ; Min: 44 (44) ; Max: 44 (44) ; Number of samples: 1)
    - InMemorySortTime: 34.201us
    - PeakMemoryUsage: 2.02 MB (2117632)
    - RowsReturned: 44 (44)
    - RowsReturnedRate: 109.11 K/sec
    - RunsCreated: 1 (1)
    - SortDataSize: 572.00 B (572)
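
When checking output like the snippet above by script rather than by
eye, a small parser is enough. The following is a minimal sketch that
assumes the layout shown in this example (info strings as plain
"Name: value" lines, counters prefixed with "- "); `parse_sort_node` is a
hypothetical helper, not part of Impala.

```python
PROFILE = """\
SORT_NODE (id=2):(Total: 403.261us, non-child: 382.029us, % non-child: 94.73%)
 SortType: Partial
 ExecOption: Codegen Enabled
    - NumRowsPerRun: (Avg: 44 (44) ; Min: 44 (44) ; Max: 44 (44) ; Number of samples: 1)
    - InMemorySortTime: 34.201us
    - PeakMemoryUsage: 2.02 MB (2117632)
    - RowsReturned: 44 (44)
    - RowsReturnedRate: 109.11 K/sec
    - RunsCreated: 1 (1)
    - SortDataSize: 572.00 B (572)
"""


def parse_sort_node(text):
    """Split a SORT_NODE snippet into info strings and counters."""
    info, counters = {}, {}
    for line in text.splitlines()[1:]:  # skip the node header line
        stripped = line.strip()
        if not stripped:
            continue
        is_counter = stripped.startswith("- ")
        if is_counter:
            stripped = stripped[2:]  # drop the "- " counter prefix
        name, _, value = stripped.partition(": ")
        (counters if is_counter else info)[name] = value
    return info, counters


info, counters = parse_sort_node(PROFILE)
```

Here `info["SortType"]` identifies the node as a partial sort, and the
absence of 'SpilledRuns' and 'MergesPerformed' from `counters` can be
asserted directly.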

Testing:
- Manually ran several sorting queries and inspected their explain
  plans and profiles

Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M fe/src/main/java/org/apache/impala/planner/SortNode.java
6 files changed, 22 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8123/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b
Gerrit-Change-Number: 8123
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
