impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5788: Fix agg node crash when grouping by nondeterministic exprs
Date Fri, 18 Aug 2017 16:58:46 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5788: Fix agg node crash when grouping by nondeterministic exprs
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7714/1/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS1, Line 1389: isRebuiltSpilledPartition
wrong casing for vars in C++


Line 1419:     for (int i = 0; i < PARTITION_FANOUT; ++i) hash_tbls_[i] = NULL;
How about something like this that works for both cases:

  for (int i = 0; i < PARTITION_FANOUT; ++i) {
    if (hash_partitions_[i] == hash_partitions_[partition_idx]) hash_tbls_[i] = nullptr;
  }

This avoids the need for the isRebuiltSpilledPartition logic.


Line 1431:   for (int i = 0; i < hash_partitions_.size(); ++i) {
I think we need to check other places that iterate over hash_partitions_ to make sure they
handle duplicate partitions correctly. E.g. this function doesn't do the right thing. I don't
know if it just works by coincidence or what.


http://gerrit.cloudera.org:8080/#/c/7714/1/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

Line 337:   /// Current partitions we are partitioning into.
Can you update the comment for hash_partitions_ too to explain the two cases (distinct partitions
versus all pointing to the same one).


http://gerrit.cloudera.org:8080/#/c/7714/1/testdata/workloads/functional-query/queries/QueryTest/spilling.test
File testdata/workloads/functional-query/queries/QueryTest/spilling.test:

Line 349: set default_spillable_buffer_size=64k;
I don't think this query option will take effect, since it's set by the test_spilling.py code.
This is a weird glitch in the test infra, but what happens is that these options are set in
the session on the Impala daemon, then they're overridden by the query options sent along
with the query.

We should remove this "set" and confirm that the test reproduces the crash with the alternative
buffer size.


PS1, Line 354: ;
Don't need trailing semicolon.


-- 
To view, visit http://gerrit.cloudera.org:8080/7714
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdb09239577b3f0a19d710b0d148e882b0b73e23
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message