impala-reviews mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] DRAFT - IMPALA-5498: Support for partial sorts
Date Thu, 22 Jun 2017 21:45:19 GMT
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7267

Change subject: DRAFT - IMPALA-5498: Support for partial sorts
......................................................................

DRAFT - IMPALA-5498: Support for partial sorts

Impala currently supports total sorts (the entire set of data
is sorted) and top-n sorts (only the highest/lowest n elements
are sorted). This patch adds the ability to do partial sorts,
where the data is divided up into some number of subsets, each
of which is sorted individually.

It accomplishes this by adding a new exec node, PartialSortNode.
When PartialSortNode::GetNext() is called, it retrieves input
up to its memory limit, uses the existing Sorter class to sort
it, and outputs it. This is faster than a total sort with SortNode
because it never needs to spill to disk, even when the input is
larger than the memory limit.
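
To illustrate the behavior described above, here is a minimal,
self-contained sketch. It is not the code in partial-sort-node.cc: the
function name PartialSort, the use of plain ints as rows, and a memory
limit expressed as a row count are all assumptions made for the example.
It only shows the idea of sorting each memory-sized run independently
and emitting it without merging runs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the partial sort idea: rows are plain ints and
// the memory limit is a row count (both are assumptions for illustration).
std::vector<int> PartialSort(const std::vector<int>& input,
                             std::size_t max_rows_in_memory) {
  assert(max_rows_in_memory > 0);
  std::vector<int> output;
  output.reserve(input.size());
  for (std::size_t start = 0; start < input.size();
       start += max_rows_in_memory) {
    const std::size_t end =
        std::min(start + max_rows_in_memory, input.size());
    // Buffer one run that fits within the limit.
    std::vector<int> run(
        input.begin() + static_cast<std::ptrdiff_t>(start),
        input.begin() + static_cast<std::ptrdiff_t>(end));
    // Sort the run on its own; runs are never merged with each other.
    std::sort(run.begin(), run.end());
    // Emit the sorted run before reading more input.
    output.insert(output.end(), run.begin(), run.end());
  }
  return output;
}
```

With the input {5, 3, 9, 1, 8, 2, 7} and a limit of 3 rows, this yields
3 5 9 1 2 8 7: each run of three is sorted, but the output as a whole
is not.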

In the planner, the existing SortNode plan node is reused, with an enum
value indicating whether the sort is total or partial.

As a first use case, partial sort is used where a total sort was
used previously for inserts into Kudu.

This patch is a work in progress, and needs to be polished and tested.

Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/partial-sort-node.cc
A be/src/exec/partial-sort-node.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
9 files changed, 328 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7267/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
