impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
Date Thu, 22 Sep 2016 00:51:12 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder

Patch Set 18:

File be/src/exec/

PS18, Line 326: we'd like all partitions to
> ... all partitions ..

PS18, Line 327: be
> are 

Line 353:   // building hash tables because allocating probe buffers may cause some more partitions
> duplicated sentences

PS18, Line 797: filters_.size() > 0)
> this doesn't seem to exactly match the logic at line 166 (that doesn't care
I simplified the code around line 166 by calculating 
bool build_filters; outside of the branches. Should be equivalent and easier to reason about.
File be/src/exec/partitioned-hash-join-builder.h:

Line 276:   /// Returns non-ok status if we couldn't spill a partition.
> nit: weird line breaking.

PS18, Line 280: should have all of their build rows
> what is this trying to say? that we're done partitioning the build?
Yeah, clarified in terms of FlushFinal().

PS18, Line 286: no buffers in either the build or probe stream.
> clarify whether this means neither has buffers or at least one does not hav

PS18, Line 287: in_mem
> are these states actually represented in code? (if not, why the underscore?

PS18, Line 291: probe_buffers_for_spilled_partitions_
> what's that?
Ah, stale comment from earlier version of the code. Updated it.

PS18, Line 301: probe_buffers_for_spilled_partitions_
> same
Ah, this method doesn't even exist and more. Removed up updated the CloseAndDeletePartitions()
comment (since it does that work now).;

PS18, Line 349: PartitionedHashJoinNode
> who owns it?
Clarified. Also added a TODO to indicate this will need to be updated for the spilling broadcast
join case when we come to design a solution for that.

PS18, Line 427: so that the
              :   /// initial probe buffer is allocated
> this doesn't really explain why the builder has to do it.  Would be good to

Line 432:   /// phase of the algorithm without spilling more partitions.
> this hints at it, so maybe move this up and reword to be more explicit.
File be/src/exec/

PS18, Line 285: *status
> if we can now do this, can we also just make NextProbeRow() and others retu
IIRC I started to do that refactoring at one point but quickly realised that it didn't work.

NextProbeRow() returns false both when status is an error and when it does not have a next
row to move to.

Line 293:       if (LIKELY(hash_tbl != NULL)) {
> is probe_partition NULL in this case?
Yes it is. I added additional DCHECKs to PrepareForProbe() to validate the consistency of
the partition state. I think adding additional DCHECKs here would obscure the logic somewhat.
I thought it was more helpful below since the logic relied more directly on the correlation
between the different partitions.
File be/src/exec/

Line 47:     "the memory limit may help this query to complete successfully.";
> weird line breakages
File be/src/exec/partitioned-hash-join-node.h:

Line 404:   /// builder_->hash_partitions. This is non-empty only in the PARTITIONING_PROBE
> underscore, right?
I guess the accessor is hash_partitions() so changed it to that.

PS18, Line 406: a
> and?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message