impala-reviews mailing list archives

From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5310: Add COMPUTE STATS TABLESAMPLE.
Date Mon, 27 Nov 2017 22:19:20 GMT
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8136 )

Change subject: IMPALA-5310: Add COMPUTE STATS TABLESAMPLE.
......................................................................


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@59
PS2, Line 59:  *   table-level column statistics. Existing partition-objects and their row count not
> nit: are not
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@338
PS2, Line 338: expectAllPartitions_ = false;
> I don't think you need that. I think it's already initialized to false.
Done
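
For context, a minimal sketch of why the assignment can be dropped, assuming the field has no explicit initializer (the patch does not show the declaration here, and the class name below is hypothetical): Java sets boolean instance fields to false before any constructor code runs.

```java
// Hypothetical stand-in, not the real ComputeStatsStmt.
class DefaultInitSketch {
  // No initializer: Java default-initializes boolean fields to false.
  private boolean expectAllPartitions_;

  boolean expectAllPartitions() {
    return expectAllPartitions_;
  }
}
```

So an extra `expectAllPartitions_ = false;` before the field is first written changes nothing.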


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@422
PS2, Line 422: expectAllPartitions_ = !(table_ instanceof HdfsTable) ||
             :           !BackendConfig.INSTANCE.enableStatsExtrapolation();
> I think there is a conflict between this line and the comment about expectA
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@435
PS2, Line 435: // Tablesample clause to be used for all child queries.
             :     String tableSampleSql = analyzeTableSampleClause(analyzer);
> nit: move it closer to L452?
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@440
PS2, Line 440: if (!updateTableStatsOnly()) {
             :       for (Column partCol: hdfsTable.getClusteringColumns()) {
             :         groupByCols.add(ToSqlUtils.getIdentSql(partCol.getName()));
             :       }
             :     }
> merge with L450?
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@503
PS2, Line 503: Sets 'sampleFileBytes_' according
             :    * to the sample.
> I think it's important to stress that  this function computes the sample. I
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@524
PS2, Line 524: Set total file bytes being scanned based on the sample.
> Maybe "Compute a sample of files to be scanned and set 'sampleFileBytes_'",
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java
File fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java:

http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java@67
PS2, Line 67: Long
> nit: do you need an object here?
Yes. A null means don't print the REPEATABLE clause. Added comment.
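
To illustrate why the boxed type is useful, here is a rough sketch, not the actual TableSampleClause code. The class, field, and method names are hypothetical, and the printed syntax assumes the TABLESAMPLE SYSTEM(...) REPEATABLE(...) form.

```java
// Hypothetical, simplified stand-in for a TABLESAMPLE clause.
class TableSampleSketch {
  private final long percentBytes;
  // Boxed on purpose: null means "no seed given", so REPEATABLE is not printed.
  private final Long randomSeed;

  TableSampleSketch(long percentBytes, Long randomSeed) {
    this.percentBytes = percentBytes;
    this.randomSeed = randomSeed;
  }

  String toSql() {
    StringBuilder sb = new StringBuilder();
    sb.append("TABLESAMPLE SYSTEM(").append(percentBytes).append(")");
    // Only emit the REPEATABLE clause when a seed was actually supplied.
    if (randomSeed != null) {
      sb.append(" REPEATABLE(").append(randomSeed).append(")");
    }
    return sb.toString();
  }
}
```

With a primitive long there would be no way to tell "seed 0" apart from "no seed"; with the object, a seed of 0 still prints REPEATABLE(0).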


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@691
PS2, Line 691: Reference<Long> numUpdatedPartitions, Reference<Long> numUpdatedColumns
> Add a comment about these two.
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@698
PS2, Line 698: if (LOG.isInfoEnabled()) {
> Does this mean that it won't print anything for debug and/or trace? Is ther
Here's a useful table of the log level hierarchy.
https://stackoverflow.com/questions/7745885/log4j-logging-hierarchy-order

INFO messages are printed when the configured level is INFO, DEBUG, TRACE, or ALL, so LOG.isInfoEnabled() is also true at the DEBUG and TRACE levels.
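
The ordering can be sketched with a toy model. This is not the Log4j API: the enum and helper names are made up for illustration, and only the level order follows the hierarchy in the linked table.

```java
// Toy model of a level hierarchy; names are hypothetical, not Log4j classes.
class LogLevelSketch {
  // Ordered from the most verbose threshold to the least verbose.
  enum Level { ALL, TRACE, DEBUG, INFO, WARN, ERROR, FATAL, OFF }

  // A message is emitted when its level is at or above the configured threshold.
  static boolean isEnabled(Level configured, Level message) {
    return message.ordinal() >= configured.ordinal();
  }

  static boolean isInfoEnabled(Level configured) {
    return isEnabled(configured, Level.INFO);
  }
}
```

In this model isInfoEnabled() is true for ALL, TRACE, DEBUG, and INFO, and false for WARN and above, so guarding with it does not suppress output at debug or trace.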


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@804
PS2, Line 804: Hive
> I think we should start calling these HMS tables/columns. Besides, soon HMS
Wfm. Done.


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@870
PS2, Line 870: Preconditions.checkState(val >= 0);
             :     Preconditions.checkState(sampleFileBytes >= 0);
             :     Preconditions.checkState(totalFileBytes >= 0);
> nit: merge into one statement?
Done


http://gerrit.cloudera.org:8080/#/c/8136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@875
PS2, Line 875: return Math.round(val * mult);
> Alternatively, you can use LongMath.checkedMultiply(), catch the arithmetic
This is a double multiplication. Added comment to point out the round() behavior.
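
A small sketch of the round() behavior in question, under assumptions: the class and method names are hypothetical, the multiplier is assumed to be the ratio of total to sampled file bytes, and a plain if/throw stands in for Guava's Preconditions.checkState(). The point is that the double product cannot throw on overflow, and Math.round(double) clamps to the long range instead of wrapping.

```java
// Hypothetical sketch, not the CatalogOpExecutor code.
class ExtrapolationSketch {
  static long extrapolate(long val, long sampleFileBytes, long totalFileBytes) {
    // One combined check; throws IllegalStateException like checkState() would.
    if (!(val >= 0 && sampleFileBytes >= 0 && totalFileBytes >= 0)) {
      throw new IllegalStateException("inputs must be non-negative");
    }
    // Assumption: scale by total bytes over sampled bytes.
    // A zero sampleFileBytes is not handled specially in this sketch.
    double mult = (double) totalFileBytes / sampleFileBytes;
    // long * double is evaluated as a double, so there is no integer overflow.
    // Math.round(double) returns Long.MAX_VALUE for values at or above the long range.
    return Math.round(val * mult);
  }
}
```

For example, extrapolate(Long.MAX_VALUE, 1, 2) yields Long.MAX_VALUE, whereas the long expression Long.MAX_VALUE * 2 silently wraps to -2. That is why LongMath.checkedMultiply() does not apply directly here.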



-- 
To view, visit http://gerrit.cloudera.org:8080/8136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f3e72471ac563adada4a4156033a85852b7c8b7
Gerrit-Change-Number: 8136
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Balazs Jeszenszky <jeszyb@gmail.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Nov 2017 22:19:20 +0000
Gerrit-HasComments: Yes
