impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Date Mon, 10 Jul 2017 22:35:41 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts

Patch Set 4:

File be/src/exec/

Line 32:     sorter_(NULL),
> This doesn't need to handle offset?

Line 127: 
> Since this only requires a partial sort, you can exit early when the cumula
As noted elsewhere, removing limit-related code.

Line 140:     *eos = input_eos_;
> I guess it is not clear to me why allowing a limit without an offset is use
Right, so 'offset' never really makes sense with partial-sort-node, since as you say there's
not a total ordering and you're not guaranteed that one instance of partial-sort-node with
an offset can pick up where another instance left off with a limit.

A limit is potentially useful just for cutting off the number of rows returned in a single
partial-sort-node, but with the initial use case for this there's no way to end up with a
partial-sort-node with a limit, so in the interest of keeping things simple and not having
untested code paths, I think I'll just remove the limit code.
File be/src/exec/partial-sort-node.h:

Line 58:   virtual Status QueryMaintenance(RuntimeState* state);
> Just for general niceness, can you also update be/src/exec/sort-node.h's pr
File be/src/runtime/

Line 1384:       expr_mem_pool, &sort_tuple_expr_evals_));
> Need to update this calculation for partial sorts.

Line 1434:   if (sorted_runs_.size() == 1) {
> Unnecessary - this is unconditionally dereferenced
File be/src/runtime/sorter.h:

Line 96:   Sorter(const TupleRowComparator& compare_less_than,
> Wasn't the plan to apply the partition sort memory limit to the sorter itse
This will be included in a follow up patch.
File common/thrift/PlanNodes.thrift:

PS3, Line 349: e {
> Should this be "first" since the order is specified elsewhere.
My problem with that wording is that it could be interpreted as "take the first N elements
of input and then sort them"

Maybe: "Return the first N sorted elements"?

Line 360: struct TSortNode {
> Please ignore my earlier comments about offset handling (although a comment
Added to partial-sort-node.h
File fe/src/main/java/org/apache/impala/planner/

Line 58:   private final long PARTIAL_SORT_MEM_LIMIT = 128 * 1024 * 1024;
> It might be clearer to specify this in bytes, then just make sure that its 

Line 286:       // The memory limit cannot be less than the size of the required blocks.
> This looks good to me.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-Reviewer: Zach Amsden <>
Gerrit-HasComments: Yes

View raw message