hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ashutosh Chauhan" <hashut...@apache.org>
Subject Re: Review Request 24770: Improve the columns stats update speed
Date Tue, 19 Aug 2014 02:03:57 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24770/#review50948
-----------------------------------------------------------



metastore/if/hive_metastore.thrift
<https://reviews.apache.org/r/24770/#comment88836>

    Do you really need dbName, tblName in this struct definition? Since, ColumnStatistics
already contain this (via ColumnStatisticsDesc), this is repetition of info.
    Further, since ColumnStatistics contain list of ColStatsObj, I dont think, you need List<ColStats>
either. Seems like:
    
    struct SetPartitionsStatsRequest {
    1: required ColumnStatistics
    }
    
    is all you need.
    
    Further, actually it seems like current api update_partition_column_statistics() already
has this signature, so you probably dont need a new api at all.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88838>

    seems like no one is using this method.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88839>

    Can you add some comments here? What this method is doing?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88840>

    If fqr.get(0) == null, why is it OK to have csid = 1?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88842>

    Use extractSqlLong() method for this.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88841>

    We want to avoid doing O(# of columns) queries on RDBMS, if possible. Reconsider this
for loop.



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/24770/#comment88844>

    Look in TxnHandler.java where we do mass upsert queries, without doing delete followed
by insert. That logic looks more cleaner to me.



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java
<https://reviews.apache.org/r/24770/#comment88837>

    Why only first row? Each row corresponds to a partition, we want to persists for all partitions
in one call, isnt it? And the way its written right now, it will actually ignore all but first
partition. I think this is the reason for failing test of colstats_part_lvl in Hive QA run.


Can you add some .q tests for this, where we do analyze statement first for table with no
stats and than with stats.

- Ashutosh Chauhan


On Aug. 16, 2014, 6:37 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24770/
> -----------------------------------------------------------
> 
> (Updated Aug. 16, 2014, 6:37 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> improve the columns stats update speed for all the partitions of a table
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   metastore/if/hive_metastore.thrift cb326f4 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0e328dd 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 23b5edf 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 4bcb2e6 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 9f583a4 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 4768128 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsRequest.java
96caab6 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddPartitionsResult.java
ba65da6 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DropPartitionsResult.java
87444d2 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Function.java
813d5f5 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetOpenTxnsInfoResponse.java
5d3bf75 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetOpenTxnsResponse.java
b938d7d 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/HeartbeatTxnRangeResponse.java
49f4e56 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockRequest.java
f860028 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/OpenTxnsResponse.java
1a99948 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/RequestPartsSpec.java
217a3c1 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SetPartitionsStatsRequest.java
PRE-CREATION 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ShowCompactResponse.java
6da732e 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ShowLocksResponse.java
554601a 
>   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
fb06b6c 
>   metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php 653b60c 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php 6cdffd5 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote e430c77 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 450018b 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py e13243b 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb bd05eba 
>   metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 74964b4 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 84ef5f9 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 237166e

>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 143d1c7 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 767cffc

>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java dbb7d37 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 0364385 
>   metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
4eba2b0 
>   metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
78ab19a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 94afaba 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 6f225f3 
> 
> Diff: https://reviews.apache.org/r/24770/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message