impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Date Fri, 14 Jul 2017 22:51:00 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
......................................................................


Patch Set 6:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/7267/6//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5498: Support for partial sorts
lets specify in Kudu for now


PS6, Line 28: A
nit: This also adds...


Line 32: used previously for inserts into Kudu.
"for inserts into Kudu"->"for DML statements into Kudu tables only. Future work can extend
this to other table sinks IMPALA-????"


Line 42:   Now: no spills, sort takes 6m19s, for ~18% speedup
comparison vs no partition/sort at all?


http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/exec/partial-sort-node.h
File be/src/exec/partial-sort-node.h:

PS6, Line 62: sort_exec_exprs_
update


PS6, Line 62: derived based
awkward/confusing


http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

PS6, Line 1383: RunSize
can you indicate the units, e.g. #rows -> NumRowsPerRun?


PS6, Line 1387:   if (no_spill_) {
              :     // 'min_buffers_required' is 1 since the sorter will only have 1 run at
a time, so
              :     // there won't be any merges.
              :     min_buffers_required = 1;
              :   } else {
              :     min_buffers_required = MIN_BUFFERS_PER_MERGE;
              :   }
nit: ternary operator


http://gerrit.cloudera.org:8080/#/c/7267/6/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

PS6, Line 45: Specifying 'no_spill' as true is
For this use case, 'no_spill' should be set to true so the sorter can reduce...


PS6, Line 103: bool no_spill = false
comment on this even though it's mentioned above


Line 117:   /// up, it is sorted and a new unsorted run is created.
comment that this may not be called if no_spill is true


PS6, Line 127: InputDone
comment that this may not be called if no_spill is true
(are there any other cases I missed?)


http://gerrit.cloudera.org:8080/#/c/7267/6/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

PS6, Line 56: TODO: run experiments to
            :   // determine the value for this, consider making it configurable, enforce
it in the BE.
JIRA?


PS6, Line 124: {
             :     return type_ != TSortType.PARTIAL;
             :   }
nit: oneline?


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message