impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-2.6.0_5.8.0) IMPALA-3369: Add ALTER TABLE SET COLUMN STATS statement.
Date Tue, 24 May 2016 21:48:41 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3369: Add ALTER TABLE SET COLUMN STATS statement.
......................................................................


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/3189/4/fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetColumnStats.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetColumnStats.java:

Line 35: 'numDVs'='100','numNulls'='10','avgSize'='5.4','maxSize'='50'
nit: I think I prefer the generic description you have in the commit msg


Line 44: public final static String NDV_KEY = "numDVs";
       :   public final static String NUM_NULLS_KEY = "numNulls";
       :   public final static String AVG_SIZE_KEY = "avgSize";
       :   public final static String MAX_SIZE_KEY = "maxSize";
Don't you think these fields and the logic in isValidStatsKey() and setStatsValue() fns belong
to the ColumnStats class?


Line 93: if (!isValidStatsKey(entry.getKey())) {
       :         throw new AnalysisException(String.format(
       :             "Invalid column stats key: %s\nValid keys are: %s",
       :             entry.getKey(), Joiner.on(',').join(ALL_STATS_KEYS)));
       :       }
I think you can push this check in L165. Actually, if you do that, you probably don't need
the isValidStatsKey function.


Line 87: // Copy the existing stats and then change the values according to the
       :     // stats map of this stmt. The existing stats are first copied to preserve
       :     // those stats values that are not changed by this stmt because all stats
       :     // values are updated when altering the stats in the HMS.
       :     colStats_ = col.getStats().clone();
       :     for (Map.Entry<String, String> entry: statsMap_.entrySet()) {
       :       if (!isValidStatsKey(entry.getKey())) {
       :         throw new AnalysisException(String.format(
       :             "Invalid column stats key: %s\nValid keys are: %s",
       :             entry.getKey(), Joiner.on(',').join(ALL_STATS_KEYS)));
       :       }
       :       setStatsValue(entry.getKey(), entry.getValue(), col, colStats_);
       :     }
Same comment as above. This looks like something that belongs to ColumnStats.analyze(). What
do you think?


Line 120: equalsIgnoreCase(
Instead of having all the equalsIgnoreCase() calls it may be more robust if your store ALL_STATS_KEYS
in lower case and then call the statsKey.toLowerCase() once. It's easy to miss a call to ignore
case if we ever modify this code again :)


Line 134: catch (Exception e) {
        :       }
Alternatively, you can wrap the setStatsValue in a try catch that catches only NumberFormatException
and converts it into an proper AnalysisException instead of having the empty catch here and
in L153. In this function you can still handle logical errors in setting column stat values.


Line 176: Map<String, TColumnStats> columnStats = Maps.newHashMap();
        :    columnStats.put(colName_.toString(), colStats_.toThrift());
        :    updateStatsParams.setColumn_stats(columnStats);
I believe thrift will generate a putToColumn_stats() function that you can use instead of
L176-178.


http://gerrit.cloudera.org:8080/#/c/3189/4/fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
File fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java:

Line 70: table_ instanceof KuduTable
       :         && !(this instanceof AlterTableSetTblProperties)
       :         && !(this instanceof AlterTableSetColumnStats)
       :         && !(this instanceof AlterTableOrViewRenameStmt
I wonder if we should have some kind of isSupportedStmt() function that gets overridden by
each table type and has the appropriate checks.


http://gerrit.cloudera.org:8080/#/c/3189/4/fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java
File fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java:

Line 24: import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
       : import org.apache.hadoop.hive.metastore.api.FieldSchema;
       : import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants;
       : import org.apache.log4j.Logger;
       : import org.kududb.client.KuduClient;
       : import org.kududb.client.LocatedTablet;
Out of curiosity, why did you move them here?


http://gerrit.cloudera.org:8080/#/c/3189/4/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java:

Line 598: 
For completion you may want to add tests where the keys are in upper case, lower case, camel-case,
etc.


Line 620: 
Also add a case where the same key is specified twice ("numNulls'='2', 'numNulls'=3). Who
would have thought of such a thing :)


http://gerrit.cloudera.org:8080/#/c/3189/4/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

Line 324: _create_db
Shouldn't we be using the unique db fixture?


-- 
To view, visit http://gerrit.cloudera.org:8080/3189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45cd8aa7241ea962788ba9ca7d0bbfd864c4304f
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.6.0_5.8.0
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message