impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4674: Part 3: fix null-aware anti join
Date Thu, 13 Jul 2017 21:59:21 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 3: fix null-aware anti join
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7367/1//COMMIT_MSG
Commit Message:

PS1, Line 13: note
not


Line 25:   Instead we just iterate over the rows of the stream.
> are there join class comments that should be updated to explain this strate
I updated the member declarations to mention when things are pinned/unpinned and updated
the EvaluateNullProbe() comment. I think this covers all of the points in the commit message.


http://gerrit.cloudera.org:8080/#/c/7367/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

Line 251:     RETURN_IF_ERROR(null_aware_partition_->Spill(BufferedTupleStreamV2::UNPIN_ALL));
> it's a bit confusing that we call Partition::Spill() when Partition::is_spi
Improved the comment.


http://gerrit.cloudera.org:8080/#/c/7367/1/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

PS1, Line 95: /// Null aware anti-join (NAAJ) extends the above algorithm by accumulating rows with
            : /// NULLs into several different streams, which are processed in a separate step to
            : /// produce additional output rows. The NAAJ algorithm is documented in more detail in
            : /// header comments for the null aware functions and data structures.
> any of this need updates (or the comments this references)?
Done.

It would be nice to have a clearer explanation of the NAAJ here (i.e. why NULLs should be
treated that way) but that seems out of scope for this change.
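
As a rough illustration of why NULLs need separate handling (this is not taken from the
patch): a null-aware anti join implements SQL `NOT IN`, which follows three-valued logic.
The sketch below models only the single-column case with no other join conjuncts; the
function name `null_aware_anti_join` and the list-based inputs are hypothetical, and
Impala's actual implementation additionally deals with partitioning, spilling and extra
conjuncts.

```python
from typing import Iterable, List, Optional


def null_aware_anti_join(
    probe: Iterable[Optional[int]], build: Iterable[Optional[int]]
) -> List[Optional[int]]:
    """Return probe values p for which `p NOT IN (build)` evaluates to TRUE.

    None stands in for SQL NULL. Hypothetical sketch of the semantics only.
    """
    build_rows = list(build)
    if not build_rows:
        # NOT IN over an empty set is TRUE for every probe row, even a NULL one.
        return list(probe)

    build_has_null = any(b is None for b in build_rows)
    non_null_build = {b for b in build_rows if b is not None}

    result = []
    for p in probe:
        if p is None:
            # NULL compared with a non-empty set is UNKNOWN, so the row is dropped.
            continue
        if p in non_null_build:
            # A definite match makes NOT IN FALSE.
            continue
        if build_has_null:
            # No match found, but a NULL build row makes the result UNKNOWN.
            continue
        result.append(p)
    return result
```

The three `continue` branches are the cases an ordinary hash anti join cannot decide from
hash table lookups alone, which is one way to see why rows with NULLs are set aside and
processed in a separate step.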


http://gerrit.cloudera.org:8080/#/c/7367/1/testdata/workloads/functional-query/queries/QueryTest/spilling-naaj.test
File testdata/workloads/functional-query/queries/QueryTest/spilling-naaj.test:

Line 3: set max_block_mgr_memory=10m;
Moved below comment


Line 8: # This returns the rows returned from as
garbled


Line 17: where l_suppkey = 4162 and l_shipmode = 'AIR' and l_returnflag = 'A' and l_shipdate > '1993-01-01' and
Fixed long lines in files


-- 
To view, visit http://gerrit.cloudera.org:8080/7367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2e60eb4dd32bd287a31479a6232400df65964c1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
