impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Date Tue, 27 Sep 2016 18:18:57 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches

Patch Set 2:

I agree that it could make a difference but I don't think we want to preserve the old behaviour.
I think we should be consistent between Blocks attached by the sorter and Blocks attached
by Join/Agg nodes. I looked at the history and the motivation seems to be in the join node:

    // We want to return as soon as we have attached a tuple stream to the out_batch
    // (before preparing a new partition). The attached tuple stream will be recycled
    // by the caller, freeing up more memory when we prepare the next partition.
    if (out_batch->AtCapacity()) break;

So it seems to be a heuristic to break out of the loop in the join node. I think if we want
to preserve that behaviour we should change the PHJ node to explicitly check whether there
were blocks attached to the batch, since that seems to be the actual intention. It's unclear
if it's worth adding that special-case to avoid accumulating a littl extra memory before we
flush out the batch.

It's somewhat broken anyway since there's nothing to prevent the other exec node from holding
onto the memory (e.g. if it's a NLJ node).

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a276000051bc2e8ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: No

View raw message