hive-dev mailing list archives

From "Sushanth Sowmyan (JIRA)" <>
Subject [jira] [Created] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty
Date Fri, 09 Oct 2015 22:46:05 GMT
Sushanth Sowmyan created HIVE-12083:

             Summary: HIVE-10965 introduces thrift error if partNames or colNames are empty
                 Key: HIVE-12083
             Project: Hive
          Issue Type: Bug
            Reporter: Sushanth Sowmyan
            Assignee: Sushanth Sowmyan

In the fix for HIVE-10965, there is a short-circuit path that causes an empty AggrStats object
to be returned if partNames is empty or colNames is empty:

diff --git metastore/src/java/org/apache/hadoop/hive/metastore/ metastore/src/java/org/apache/hadoop/hive/metastore/
index 0a56bac..ed810d2 100644
--- metastore/src/java/org/apache/hadoop/hive/metastore/
+++ metastore/src/java/org/apache/hadoop/hive/metastore/
@@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
   public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
       List<String> partNames, List<String> colNames, boolean useDensityFunctionForNDVEstimation)
       throws MetaException {
+    if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); // Nothing to
     long partsFound = partsFoundForPartitions(dbName, tableName, partNames, colNames);
     List<ColumnStatisticsObj> colStatsList;
     // Try to read from the cache first

This runs afoul of thrift requirements that AggrStats have required fields:

struct AggrStats {
1: required list<ColumnStatisticsObj> colStats,
2: required i64 partsFound // number of partitions for which stats were found
}

Thus, we get errors as follows:

2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer ( -
Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is unset! Struct:AggrStats(colStats:null,
        at org.apache.hadoop.hive.metastore.api.AggrStats.validate(
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(
        at org.apache.thrift.ProcessFunction.process(
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$
        at Method)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(
        at org.apache.thrift.server.TThreadPoolServer$
        at java.util.concurrent.ThreadPoolExecutor.runWorker(
        at java.util.concurrent.ThreadPoolExecutor$

Normally this would not occur, because HIVE-10965 also adds a client-side guard that skips
the metastore call entirely when colNames.isEmpty(). There is no corresponding guard for an
empty partNames, however, and the error would still occur on the metastore side whenever the
thrift call is made directly, as would happen with a client from an older version that
predates the patch.
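
To illustrate the failure mode and one possible way around it, here is a minimal, self-contained
sketch. It does not use the real Hive or Thrift classes: AggrStatsSketch is a hypothetical
stand-in for the generated AggrStats, colStats is modelled as a list of strings, and
emptyResult() is an assumed helper name, not the actual patch. The idea is only that the
short-circuit path should populate every required field before the struct is validated and
written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Thrift-generated AggrStats; NOT the real Hive class.
final class AggrStatsSketch {
    private List<String> colStats;   // stands in for list<ColumnStatisticsObj>
    private long partsFound;
    private boolean partsFoundSet;   // tracks whether the primitive field was assigned

    // No-arg constructor: leaves both required fields unset, like `new AggrStats()`.
    AggrStatsSketch() {}

    // All-fields constructor: both required fields are populated.
    AggrStatsSketch(List<String> colStats, long partsFound) {
        this.colStats = colStats;
        this.partsFound = partsFound;
        this.partsFoundSet = true;
    }

    // Mimics the required-field check that raised the error in the report.
    void validate() {
        if (colStats == null) {
            throw new IllegalStateException("Required field 'colStats' is unset!");
        }
        if (!partsFoundSet) {
            throw new IllegalStateException("Required field 'partsFound' is unset!");
        }
    }

    List<String> getColStats() { return colStats; }
    long getPartsFound() { return partsFound; }

    // Assumed helper: an "empty" result that still satisfies the required fields.
    static AggrStatsSketch emptyResult() {
        return new AggrStatsSketch(new ArrayList<>(), 0L);
    }

    public static void main(String[] args) {
        try {
            new AggrStatsSketch().validate();   // short-circuit as in the diff above
        } catch (IllegalStateException e) {
            System.out.println("unset struct rejected: " + e.getMessage());
        }
        emptyResult().validate();               // passes: empty list, zero partitions
        System.out.println("populated empty struct accepted");
    }
}
```

Under these assumptions, a short-circuit that returns an empty list and a partsFound of 0 has
the same meaning as "nothing to aggregate" while still passing validation, and a server-side
check protects against older clients that lack the client-side guard.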

This message was sent by Atlassian JIRA