impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations
Date Tue, 02 Aug 2016 00:00:15 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations

Patch Set 1:


I thought some more about this during the day and I think there are some subtle  issues around
spilling and passthrough aggs with limits that we don't have test coverage for.
File be/src/exec/

Line 130:   if (group_count_ < max_groups_) {
> We can turn this into a one-liner without an else.
This will also give incorrect results for spilling aggregations. If AGGREGATED_ROWS is true,
this is processing a spilled row that was already aggregated, so will double-count that group.

I think we should add a test to that exercises this code path (e.g. a large
agg with a high limit).

Line 250:     if (group_count_ >= max_groups_) {
I think this should should actually occur before the remaining_capacity check. Otherwise we'll
pass through rows even when we already have enough groups (meaning the limit isn't correctly
enforced and we do unnecessary work).
File be/src/exec/

PS1, Line 506: ReachedLimit
Do we actually need these limit checks for grouping aggregations now? If not, we should remove
the redundant checks and document how the limit is enforced.

Line 519: Status PartitionedAggregationNode::GetRowsStreaming(RuntimeState* state,
We don't actually implement limit checks in GetRowsStreaming(), since previously limits were
never applied.

I think in some circumstances (e.g. if it switches into streaming mode) the current version
of the patch will return more rows than the limit. I think if you fix my other comment in this may not be the case, but it would be good to think through and document why the
explicit limit check isn't necessary (if it indeed isn't).

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message