impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4794: Grouping distinct agg plan robust to data skew
Date Mon, 14 Aug 2017 21:38:30 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4794: Grouping distinct agg plan robust to data skew

Patch Set 3:


Looks good, just minor polish of the commit msg.
Commit Message:

Line 11: plan partitions data between phase-1 and phase-2 by grouping expr and
the grouping exprs

Line 12: the data skewness on grouping expr directly impacts performance. The
make this statement about skew a separate sentence

Line 13: new plan partitions data by both grouping expr and distinct aggregation
by both the grouping and distinct agg exprs

Line 14: expr, then adds one more aggregation and exchange node. It is supposed
Try to avoid descriptions like "supposed to be". We should test and understand the impact
of changes clearly and state it with confidence. I suggest rephrasing to something like:

The new plan is more robust to data skew but does more work than the old plan.

Line 19: exchange node, followed by an additional merge aggregation and exchange
the first exchange node
File fe/src/main/java/org/apache/impala/planner/

Line 917: 
> I think that only works with grouping expr and is at line 922.
Not sure how I missed that line. You are right, of course.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bdada0e328b555900c7b7ff8aabc8eb15ae8fa9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Tianyi Wang <>
Gerrit-HasComments: Yes

View raw message