impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3586: Implement union passthrough
Date Wed, 15 Feb 2017 23:42:40 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-3586: Implement union passthrough

Patch Set 7:

File be/src/exec/

Line 145:   DCHECK(child(0)->row_desc().IsPrefixOfEquivalent(row_desc()));
This should be IsPrefixOf() because we sanity checking the row descriptors of exec nodes (and
not row batches).
File be/src/exec/

Line 115:   // passthrough case, the child can be closed right away because the row batch
the child can be closed right away -> the child was already closed?

Line 116:   // from the child is copied (more details below).
accuracy: the row batch wasn't copied

Line 121:   if (child_idx_ < children_.size() && isChildPassThrough(child_idx_))
Suggest comment:
// Handle passthrough children. We pass 'row_batch' directly into the GetNext() call on the

Line 122:     // If the rows from the current child can be passed through, the physical row
This comment doesn't seem to add anything, I suggest removing it.

Line 131:     // It may be possible that the row batch is not empty, so we save the previous
More details would be helpful. If the batch has rows at this point, I'd imagine it can cause
all sorts of other problems. How can the batch already have rows?

Line 148:       // 'needs_deep_copy' lets us safely close the child in the next GetNext()
style: 'needs_deep_copy' is not a visible variable here, suggest just saying "Marking the
batch as needing a deep copy let's us ...

Line 154: 
DCHECK that child_idx_ is not passthrough here
File be/src/exec/union-node.h:

Line 67:   // Which children can be passed through, without being materialized.
... without evaluating and materializing their exprs.
File be/src/runtime/descriptors.h:

Line 412:   /// Return true if the physical layout of this descriptor matches the physical
matches that of other_desc

Line 413:   /// of other_desc, but not necessarily ids.
bot not necessarily the id.

Line 565:   /// of other_desc, but not necessarily ids.
but not necessarily the ids
File be/src/service/query-options.h:

Line 38:       TImpalaQueryOptions::DISABLE_UNION_PASSTHROUGH + 1);\
I tend to prefer ENABLE_UNION_PASSTHROUGH. To me positive phrasing is a little easier to understand.
What do you think?
File fe/src/main/java/org/apache/impala/analysis/

Line 227:   public boolean isEquivalent(SlotDescriptor other) {
Unfortunately, the term 'equivalent' already has a different meaning in the FE for slots,
so it would be good to the existing term fro this new one. Maybe isLayoutEquivalent()?
File fe/src/main/java/org/apache/impala/analysis/

Line 616:   public List<Expr> getUnionNodeResultExprs() {
File fe/src/main/java/org/apache/impala/planner/

Line 47:  * a child has an identical tuple layout as the output of the union node, the
... as the output of the union node, and the child only has naked SlotRefs as result exprs,
then the child is marked as 'passthrough'. The rows of passthrough children are directly returned
by the union node, instead of materializing the child's result exprs into new tuples.

Line 57:   protected final List<Expr> resultExprs_;
unionResultExprs_ to make distinguish it from the resultExprLists_ and such

Line 73:   // If false, no batches from child nodes would be passed through in the backend.
Comment should describe what this flag is. Also you mean "true" right?

Line 76:   // Indicates for which child nodes batches can be passed through in the backend.
remove "in the backend" (it's clear that execution happens there)

Line 81:   protected UnionNode(PlanNodeId id, TupleId tupleId) {
Is this c'tor still needed? If not, remove.

Line 89:                       List<Expr> resultExprs, boolean isInSubplan) {
indentation, unionResultExprs

Line 182:    * Compute whether we can pass through rows without materializing for the given
Can combine/simplify like this:

Returns true if rows from the child with 'childTupleIds' and 'childResultExprs' can be returned
directly by the union node (without materialization into a new tuple).

Might be good to list the conditions for passthrough compatibility.

Line 189:       Analyzer analyzer, List<TupleId> childTupleIds, List<Expr> childExprList)
childExprList -> childResultExprs

Line 190:     if (analyzer.getQueryOptions().isDisable_union_passthrough()) return false;
seems clearer to move this into init() and not execute any of the passthrough code

Line 193:     // TODO: Remove this as part of IMPALA-4179.
Move TODO to the class comment

Line 194:     if (isInSubplan_) return false;
same here, seems easier to move this check into init()

Line 205:     // Verify that the union node has one slot for every expression.
union node -> union tuple descriptor

Line 212:     if (resultExprs_.size() != childTupleDescriptor.getSlots().size()) return false;
I don't think this tricky check is correct because it won't allow passthrough for something

select int_col, int_col, int_col from functional.alltypes
union all
select int_col, int_col, int_col from functional.alltypes

Line 218:       SlotRef unionSlot = resultExprs_.get(i).unwrapSlotRef(false);
unionSlotRef, childSlotRef

Line 220:       if (!unionTupleDescriptor.getSlots().get(i).isMaterialized()) continue;
move above the unwrapSlotRef() calls

Line 221:       Preconditions.checkState(unionSlot.getDesc().getParent().getId().equals(tupleId_));
Don't think we need this check, but something like this would be good:


Line 223:         Preconditions.checkState(!(childSlot instanceof SlotRef));
No need for this check

Line 262:     // Compute which nodes can be passed through.
which child nodes

Line 266:       // Check that if the child outputs a single tuple, then it's not nullable.
move into computePassThrough
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 329: # IMPALA-3586: The operand with the Kudu scan cannot be passed through because it's
because id is not-nullable (primary key)

Line 346: select id from functional_kudu.alltypes
do select *
File testdata/workloads/functional-planner/queries/PlannerTest/union.test:

Line 3104: # IMPALA-3678: Both union operands should produce rows with non-nullable slots
remove "should"

Line 3124: # IMPALA-3678: The Union operands that contain a join should not be passed through,

Line 3184: select COUNT(c.c_custkey), COUNT(v.tot_price)
lowercase count for consistency
File testdata/workloads/functional-query/queries/QueryTest/union.test:

Line 1126: select COUNT(c.c_custkey), COUNT(v.tot_price)
lowercase count for consistency

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia8f6d5062724ba5b78174c3227a7a796d10d8416
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message