impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vuk Ercegovac (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5310: Add COMPUTE STATS TABLESAMPLE.
Date Wed, 27 Sep 2017 06:54:45 GMT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8136 )

Change subject: IMPALA-5310: Add COMPUTE STATS TABLESAMPLE.
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@55
PS1, Line 55: >)]
I was expecting to see a mention of REPEATABLE here (per change description).


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@58
PS1, Line 58: partition-level row counts are not modified.
does this allow for the state of table vs. partition stats to be inconsistent (e.g., combining
per-partition stats may not add up to the table level stats)?


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@161
PS1, Line 161: .
pls clarify if these should never be valid at the same time.


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@184
PS1, Line 184: partition
nit: partitions


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@780
PS1, Line 780: cardinality_
it would be clearer if this is named outputCardinality_ (the comment for this method refers
to it this way, so there's already some drift).


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@830
PS1, Line 830: getNumRows
used at 4 call sites in this for-loop. assign it to a variable.
will also make things more consistent with the non-partitioned case at L845.


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1122
PS1, Line 1122: tbl_.getNumClusteringCols() > 0
this is used in several places-- it would be clearer if it was wrapped in something like "isPartitioned()"


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@744
PS1, Line 744: Impala
got confused here since previous line mentions Hive and Impala tables (why should the partitions
be Impala partitions in all cases?)


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@795
PS1, Line 795: Hive table
given previous comment referencing Hive and Impala tables, I'm unclear about what is specific
to Hive tables here.


http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@860
PS1, Line 860: totalFileBytes / sampleFileBytes
is returned result guaranteed to be >= val?



-- 
To view, visit http://gerrit.cloudera.org:8080/8136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f3e72471ac563adada4a4156033a85852b7c8b7
Gerrit-Change-Number: 8136
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Sep 2017 06:54:45 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message