impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3748: Part 1: Clean up resource estimation in planner
Date Tue, 07 Feb 2017 22:39:18 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3748: Part 1: Clean up resource estimation in planner

Patch Set 3:

Commit Message:

Line 28: * Many operators do not have estimates or have questionable heuristics.
> are you planning on addressing this particular point in a follow-up? the es
I'm not planning to try to fix the per-plan-node estimation - I just wanted to add test coverage
and do this cleanup before trying to build more on top of this. We don't need the memory estimates
to compute minimum reservation. The minimum reservation could provide a floor for the memory
estimate, though.
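A minimal sketch of that "floor" idea, under the assumption that both quantities are available as byte counts (the names here are illustrative, not the patch's actual fields):

```java
public class EstimateFloor {
  // Hypothetical floor: a memory estimate should not fall below the
  // minimum reservation the node needs in order to execute at all.
  static long flooredEstimate(long memEstimate, long minReservation) {
    return Math.max(memEstimate, minReservation);
  }
}
```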

The important thing in this patch for the forthcoming min reservation calculation is how the
node sets are computed, since that is also needed for min memory estimates. This patch adds
test coverage for that logic in advance of adding the reservation calculations.

If we think there is value in fixing the memory estimates, we should treat that as a separate issue; it doesn't block making progress on the buffer pool minimum estimates. As you mentioned, there are potential downsides to increasing memory estimates, even if it improves the accuracy of per-node estimates, because an underestimate could be offsetting an overestimate elsewhere. That said, memory-estimate-based admission control is fairly unusable as-is.
File fe/src/main/java/org/apache/impala/planner/

Line 249:     perHostMemCost_ = 0L;
> does this make any difference in java?
Spurious change - reverted.
File fe/src/main/java/org/apache/impala/planner/

Line 69:    * Computes the estimated per-host memory and CPU requirements of this plan node
> remove references to cpu requirements

Line 105:         perHostHdfsScanMem += node.getPerHostMemCost();
> this is incorrect for mt execution, because it extrapolates the number of e

Line 116:         Math.min(perHostHbaseScanMem, HBaseScanNode.getPerHostMemUpperBound());
> same here
The HBase estimates are purely placeholders: they assume 1GB per scan, capped at 1GB per query
across all scans, and also assume 100 nodes. It's not clear to me what the "right" answer is
or whether it's worth trying to figure that out. My preference is not to touch it since I
don't know what the correct fix is.
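The placeholder behaviour described above could be sketched like this (constants and names are illustrative, not Impala's actual code):

```java
public class HBaseScanEstimate {
  // Illustrative placeholder constants, per the description above:
  static final long PER_SCAN_BYTES = 1L << 30; // assume 1GB per scan...
  static final long PER_QUERY_CAP = 1L << 30;  // ...capped at 1GB per query

  static long perHostHbaseScanMem(int numScanNodes) {
    // Sum the per-scan assumption, then apply the per-query cap.
    long total = (long) numScanNodes * PER_SCAN_BYTES;
    return Math.min(total, PER_QUERY_CAP);
  }
}
```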

Line 123:       sink.computeCosts();
> this is also wrong, because it doesn't take sorting into account.
See my comment on the commit message - I don't want to start doing piecemeal fixes to the
estimates for all the plan node types. I'd rather get this cleanup in plus the test coverage,
then that can be the base for doing the buffer pool and any memory estimate fixes.

Line 197:     // TODO: can overestimate resource consumption of UnionNodes - the execution of union
> this needs to get fixed. see earlier comment.
The tricky part is that the branches of the union within the current fragment are serialised,
but anything below an exchange runs concurrently.

I think this would be throwaway work if I did it now since IMPALA-4862 requires significant
changes to the lhsSet/rhsSet logic. I updated IMPALA-4862 to include the union work, since
I think they'd be easiest to tackle together.
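A rough sketch of the peak-accounting distinction described above, assuming per-branch peak estimates are already known (this is illustrative only, not the planner's actual lhsSet/rhsSet computation):

```java
import java.util.List;

public class UnionPeak {
  // Branches in the current fragment execute one after another, so the
  // serial part contributes only the max over branches; subtrees below an
  // exchange run concurrently, so their peaks are added on top.
  static long peakMem(List<Long> inFragmentBranches, List<Long> belowExchange) {
    long serialPeak =
        inFragmentBranches.stream().mapToLong(Long::longValue).max().orElse(0L);
    long concurrent =
        belowExchange.stream().mapToLong(Long::longValue).sum();
    return serialPeak + concurrent;
  }
}
```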

Line 205:    * Estimate of the peak resources that will be consumed by a set of plan nodes.
> move above where it's used

Line 209:     private final boolean valid_;
> prefixing boolean variables with 'is-" is helpful

Line 214:       this.valid_ = valid;
> remove this_
File fe/src/main/java/org/apache/impala/planner/

Line 344:    * Estimates the per-host memory and CPU requirements for the given plan fragments,
> remove references to cpu requirements (here and elsewhere)

Line 375:     if (isTrivialCoordOnlyPlan(fragments)) {
> check early?

Line 381:     // excludeUnpartitionedFragments is true.
> mysterious
It doesn't make sense to me. I looked at the review that originally added this code and there
wasn't any real explanation. I guess the idea is that if we're going to multiply the resource
estimate by the number of hosts, it's more accurate to exclude unpartitioned fragments?

I can't see a valid reason to exclude unpartitioned fragments, so I removed this flag. As
part of the MT changes I still factor in partitioned vs. unpartitioned fragments, since that
affects the number of instances per host and therefore the per-host memory consumption.
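One hypothetical way to see why the distinction matters when scaling a per-host estimate to a cluster total (illustrative names, not the actual planner code):

```java
public class ClusterEstimate {
  // A partitioned fragment runs an instance on every host, so its per-host
  // estimate scales with the host count. An unpartitioned (e.g.
  // coordinator-only) fragment runs on exactly one host and should not be
  // multiplied.
  static long clusterTotalMem(long partitionedPerHost, long unpartitionedMem,
      int numHosts) {
    return partitionedPerHost * numHosts + unpartitionedMem;
  }
}
```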

PS3, Line 395: || maxPerHostMemBytes == Long.MIN_VALUE
This was redundant.
File fe/src/test/java/org/apache/impala/planner/

Line 369:     options.setNum_scanner_threads(1); // Required so that output doesn't vary by
> please also add tests for mt execution
Added PARALLELPLANS to this test.

I ended up significantly reworking the resource estimation so that it is more oriented around
instances - previously it conflated hosts and fragment instances in various ways.
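The instance-oriented view could be sketched as follows, assuming a degree-of-parallelism setting like mt_dop determines instances per host for partitioned fragments (names are illustrative):

```java
public class PerHostEstimate {
  // Per-host memory is the per-instance estimate times the number of
  // instances on that host: mt_dop instances for a partitioned fragment,
  // a single instance otherwise.
  static long perHostMem(long perInstanceMem, boolean partitioned, int mtDop) {
    int instancesPerHost = partitioned ? Math.max(1, mtDop) : 1;
    return perInstanceMem * instancesPerHost;
  }
}
```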
File fe/src/test/java/org/apache/impala/planner/

Line 413:             ignoreExplainHeader, errorLog, actualOutput);
> odd indentation

Line 420:             actualOutput);
> odd indentation, please fix throughout
File testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 46: |  hosts=3 per-host-mem=160B
> we need to impose a meaningful minimum, such as the minimal mempool size
There are so many problems with these memory estimates that I think it's best to avoid piecemeal
changes like this until there's a clearer overall picture of how the memory estimates should
be derived.

Line 52: |  hosts=3 per-host-mem=128.00MB
> where did that come from?
It's the hardcoded default in AggregationNode that is used when stats are unavailable. Why
128MB? I don't know.
File testdata/workloads/functional-planner/queries/PlannerTest/resource-estimates.test:

Line 120: |  hosts=3 per-host-mem=10.00MB
> this should express the minimum required to operate at all, rather than "ho
The 10MB number comes from MIN_HASH_TBL_MEM in AggregationNode.

These are the old memory estimates - I believe they were meant to approximate the amount of
memory required to run entirely in-memory. It's unclear to me what we want to base estimates
on going forward, but I think there will be value in the future in having estimates of the
ideal memory.

I'm going to add minimum buffer requirements for spilling exec nodes in a follow-on patch
- maybe that's what you were looking for.
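The distinction between the "ideal" estimate and a minimum requirement could be sketched as follows (the buffer-count and buffer-size parameters are hypothetical, not the actual follow-on patch's design):

```java
public class SpillingNodeMem {
  // A spilling operator needs some minimum number of buffers pinned in
  // memory to make progress at all, even when the rest of its working set
  // spills to disk. This is a floor, distinct from the (larger) estimate
  // of what it takes to run entirely in-memory.
  static long minReservationBytes(int minBuffers, long bufferSizeBytes) {
    return (long) minBuffers * bufferSizeBytes;
  }
}
```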


Gerrit-MessageType: comment
Gerrit-Change-Id: I1e358182bcf2bc5fe5c73883eb97878735b12d37
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes
