hive-dev mailing list archives

From "Ning Zhang" <nzh...@fb.com>
Subject Re: Review Request: HIVE-2050. batch processing partition pruning process
Date Mon, 28 Mar 2011 05:59:19 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/522/
-----------------------------------------------------------

(Updated 2011-03-27 22:59:19.075996)


Review request for hive.


Changes
-------

There are 2 major changes from the last patch:
 - Added a parameter, hive.metastore.batch.retrieve.max, to control the maximum number of
partitions that can be retrieved from the metastore in one batch (default 300). In
Hive.getPartitionsByNames(), the input partition name list is split into sublists, and the
metastore API is called once for each sublist.
 - One of the most time-consuming DB operations is retrieving the sub-classes of MPartition.
In particular, the list of FieldSchema is retrieved for each partition but is never used
(the table's field schema is used for all partitions). So one of the changes here is to omit
the retrieval of FieldSchema and use the table's field schema as the partition's. If we later
need the partition's own field schema for schema evaluation, we should add another
function/flag for that.
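
As a rough illustration of the batching described in the first change, the sketch below splits a list of partition names into sublists of bounded size and makes one call per sublist. It is not the code in the patch: the class name BatchRetrieveSketch, the method getByNamesInBatches, and the fetchBatch callback (standing in for the metastore call) are hypothetical names chosen for this example. Only the batch limit of 300 comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; names here are illustrative and not taken from the patch.
public class BatchRetrieveSketch {

    /**
     * Splits names into sublists of at most batchMax entries and calls
     * fetchBatch once per sublist, concatenating the results in order.
     * fetchBatch stands in for the metastore call that returns partitions.
     */
    public static <P> List<P> getByNamesInBatches(
            List<String> names,
            int batchMax,
            Function<List<String>, List<P>> fetchBatch) {
        if (batchMax <= 0) {
            throw new IllegalArgumentException("batchMax must be positive");
        }
        List<P> result = new ArrayList<>(names.size());
        for (int start = 0; start < names.size(); start += batchMax) {
            int end = Math.min(start + batchMax, names.size());
            // subList returns a view; copy it so the callee gets an independent list.
            List<String> batch = new ArrayList<>(names.subList(start, end));
            result.addAll(fetchBatch.apply(batch));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 650; i++) {
            names.add("ds=" + i); // made-up partition names
        }
        List<Integer> batchSizes = new ArrayList<>();
        List<String> fetched = getByNamesInBatches(names, 300, batch -> {
            batchSizes.add(batch.size());
            return batch; // stand-in: echo the names instead of real partitions
        });
        System.out.println(batchSizes + " " + fetched.size()); // prints "[300, 300, 50] 650"
    }
}
```

Bounding the size of each request keeps any single metastore call, and the result set it materializes, to a predictable size, at the cost of more round trips for very large partition lists.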

These changes reduce memory by 50% and CPU by 20%. 


Summary
-------

Introduces a new metastore API to retrieve a list of partitions in batches.


Diffs (updated)
-----

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1085555 
  trunk/conf/hive-default.xml 1085555 
  trunk/metastore/if/hive_metastore.thrift 1085555 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085555 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085555 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085555 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1085555 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1085555 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1085555 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 1085555 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java 1085555 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 1085555 

Diff: https://reviews.apache.org/r/522/diff


Testing
-------


Thanks,

Ning

