impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3748: estimate minimum buffers in planner
Date Thu, 09 Mar 2017 03:51:21 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3748: estimate minimum buffers in planner

Patch Set 8:

File fe/src/main/java/org/apache/impala/planner/

Line 48:   public class ResourceRequirement {
these aren't requirements if they include the existing "estimates". ResourceProfile?

it feels like this should be a standalone class, so it can be attached to plan nodes.

Line 115:       node.computeCosts(queryOptions);
they're not costs

Line 156:             hdfsScanMemEstimate + hbaseScanMemEstimate + nonScanMemEstimate + dataSinkMemEstimate);
long line
File fe/src/main/java/org/apache/impala/planner/

Line 209:   public int getNumInstancesPerHost(TQueryOptions queryOptions) {
passing in the dop directly is preferable, because it's easier to see what's going on and
it's narrower.

Line 232:   public long getNumDistinctValues(TQueryOptions queryOptions, List<Expr>
exprs) {
rename to include "per instance" in the name to clarify. you might also want to abbreviate
NumDistinctValues as Ndv to keep this manageable.
File fe/src/main/java/org/apache/impala/planner/

Line 120:   // estimated per-instance memory usage for this plan node in bytes;
the "estimates" don't really have defined semantics (max? desirable?). we might want to deprecate
this, and mark it as such, to decrease the amount of confusion generated by having multiple
values that look vaguely related.

also, "usage" still doesn't mean anything. i'm okay with simply marking this "obsolete".

Line 124:   // Minimum number of buffers required to execute spilling operator.
why restrict this to spilling? doesn't every operator have a defined requirement?

Line 281:       TQueryOptions queryOptions, TExplainLevel detailLevel) {
what's the role of the query options?

Line 317:           PrintUtils.printInstances(" ", fragment_.getNumInstances(queryOptions)));
why do we want this?

Line 322:             PrintUtils.printPerInstanceMinBuffers(" ", perInstanceMinBufferBytes_));
why make this conditional?
File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

Line 10: |  hosts=1 instances=1 est-per-instance-mem=0B
how come the number of hosts changed?
File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test:

Line 15:      hosts=1 instances=1 est-per-instance-mem=0B
regarding the number of fragment instances: that's really a per-fragment (rather than per
execnode) concept, so probably less noisy to print this at the fragment level.
File testdata/workloads/functional-query/queries/QueryTest/explain-level0.test:

Line 3
why change this?

Line 8: 'Per-Host Resource Estimates: Memory=388.41MB'
how come no resource requirements?
File testdata/workloads/functional-query/queries/QueryTest/explain-level3.test:

Line 8: 'Per-Host Resource Estimates: Memory=388.41MB'
how come this is missing the requirements?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e358182bcf2bc5fe5c73883eb97878735b12d37
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message