impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
Date Thu, 15 Sep 2016 18:07:09 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
......................................................................


Patch Set 14:

(2 comments)

Before responding to individual comments, I want to point out that the coupling is now
one-way: the builder never refers to PhjNode aside from metadata provided via the constructor
(you can verify there are no references to partitioned-hash-join-node.h or
PartitionedHashJoinNode in the builder code).
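
As a rough illustration of what "one-way" means here, the sketch below uses hypothetical
stand-in names (JoinMetadata, Builder, JoinNode) rather than the actual Impala classes. The
builder sees only the metadata it was constructed with, while the node owns and drives the
builder:

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the join metadata handed to the builder.
struct JoinMetadata {
  std::string join_op;  // e.g. "INNER_JOIN"
  int num_build_exprs;
};

// Builder side: knows only the metadata it was given at construction.
// Nothing in this class names or includes the join node.
class Builder {
 public:
  explicit Builder(JoinMetadata meta) : meta_(std::move(meta)) {}
  const std::string& join_op() const { return meta_.join_op; }
  int num_build_exprs() const { return meta_.num_build_exprs; }

 private:
  const JoinMetadata meta_;
};

// Node side: owns and drives the builder, so the dependency points one way.
class JoinNode {
 public:
  explicit JoinNode(const JoinMetadata& meta)
      : builder_(std::make_unique<Builder>(meta)) {}
  const Builder& builder() const { return *builder_; }

 private:
  std::unique_ptr<Builder> builder_;
};
```

Because Builder has no back-pointer, it could in principle be constructed somewhere other
than inside the node, as long as the same metadata is available there.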

At this point it should only require minor changes to this hash join code to create the builder
in a separate fragment *before* the join node, provided you have a 1:1 mapping from builder
to node. 1:n broadcast joins with no spilling would require some more changes, but are a lot
closer.

I chose this stopping point because there was a clear physical separation between the classes,
even if the spilling algorithm has some logical coupling and assumes a 1:1 mapping from node
to builder. 

Reducing the logical coupling would require significant changes to the spilling algorithm,
which is driven from PartitionedHashJoinNode::GetNext(). I think that should be deferred until
there is some consensus about what spilling for 1:n broadcast joins with multithreading might
look like.

http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

Line 123:   bool HashTableStoresNulls() const;
> why is this here rather than the join node? especially given that the join 
Both classes already have their own hash table context.


Line 128:   inline const std::vector<bool>& is_not_distinct_from() const {
> and then should this be here? is it to be closer to being able to break the
There isn't a parent link - all the necessary metadata about the join is passed in when the
builder is initialised.
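
To make this concrete, here is a minimal sketch of how both accessors could be answered
purely from state passed to the builder's constructor. The class name BuilderSketch, the
JoinOp enum, and the exact rule inside HashTableStoresNulls() are assumptions for
illustration, not the code in this patch:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical subset of join operators, for illustration only.
enum class JoinOp { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER };

class BuilderSketch {
 public:
  // All join metadata arrives here; there is no pointer back to a join node.
  BuilderSketch(JoinOp op, std::vector<bool> is_not_distinct_from)
      : join_op_(op), is_not_distinct_from_(std::move(is_not_distinct_from)) {}

  // Assumed rule: NULL keys are kept when unmatched build rows may be output,
  // or when any join predicate is null-safe (IS NOT DISTINCT FROM).
  bool HashTableStoresNulls() const {
    const bool outputs_unmatched_build =
        join_op_ == JoinOp::RIGHT_OUTER || join_op_ == JoinOp::FULL_OUTER;
    const bool any_null_safe =
        std::any_of(is_not_distinct_from_.begin(), is_not_distinct_from_.end(),
                    [](bool b) { return b; });
    return outputs_unmatched_build || any_null_safe;
  }

  const std::vector<bool>& is_not_distinct_from() const {
    return is_not_distinct_from_;
  }

 private:
  const JoinOp join_op_;
  const std::vector<bool> is_not_distinct_from_;
};
```

In this shape the join node can query the builder for either value, but the builder never
needs to ask the node for anything.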


-- 
To view, visit http://gerrit.cloudera.org:8080/3873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
