hive-dev mailing list archives

From j.prasant...@gmail.com
Subject Re: Review Request 38702: HIVE-11553 use basic file metadata cache in ETLSplitStrategy-related paths
Date Thu, 24 Sep 2015 06:57:26 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38702/#review100363
-----------------------------------------------------------



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (line 5750)
<https://reviews.apache.org/r/38702/#comment157544>

    This is very hacky.
    I think reflection will be slower than ByteBuffer.put(). put() uses an intrinsic method
(HeapBB) or an unsafe copy (DirectBB), and both should be faster than reflection. Can you get
rid of the reflection?
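
    For illustration only, a minimal sketch of the plain put() route. The class and method
names (BufferCopySketch, copyOf) are hypothetical and not taken from the patch; the sketch
only shows that the bulk put() overloads cover both heap and direct buffers without any
reflective access.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not from the patch under review.
final class BufferCopySketch {
    // Copies src into a new buffer using the bulk put(byte[]); no reflection involved.
    static ByteBuffer copyOf(byte[] src, boolean direct) {
        ByteBuffer dst = direct
            ? ByteBuffer.allocateDirect(src.length)
            : ByteBuffer.allocate(src.length);
        dst.put(src);  // bulk copy into the buffer
        dst.flip();    // switch from writing to reading
        return dst;
    }

    // Copies the remaining bytes of one buffer into a new heap buffer via put(ByteBuffer).
    static ByteBuffer copyOf(ByteBuffer src) {
        ByteBuffer dst = ByteBuffer.allocate(src.remaining());
        dst.put(src.duplicate());  // duplicate() leaves the caller's position untouched
        dst.flip();
        return dst;
    }
}
```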



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1508)
<https://reviews.apache.org/r/38702/#comment157550>

    Too many args for a simple get/put pair.
    
    FileInfo[] getAll(Collection<HdfsFileStatusWithId> files);
    FileInfo get(HdfsFileStatusWithId file) (optional);
    
    void putAll(Collection<HdfsFileStatusWithId> files) (optional);
    void put(HdfsFileStatusWithId file, FileInfo fileInfo);
    
    In fact, the FooterCache interface can be generic.
    
    This should carry all the information that both cache implementations need, I guess. Right?
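
    A rough sketch of what a generic version could look like. This is an assumption about
the shape, not the patch's actual interface: K stands in for HdfsFileStatusWithId and V for
FileInfo, getAll returns a List rather than an array (generic arrays are awkward in Java),
putAll takes a Map so each file travels with its info, and MapFooterCache is a hypothetical
in-memory implementation. The default methods require Java 8.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic shape; K stands in for HdfsFileStatusWithId, V for FileInfo.
interface FooterCache<K, V> {
    V get(K file);

    void put(K file, V info);

    // Returns one entry per input file, in iteration order; null marks a cache miss.
    default List<V> getAll(Collection<? extends K> files) {
        List<V> result = new ArrayList<>(files.size());
        for (K file : files) {
            result.add(get(file));
        }
        return result;
    }

    // Stores every entry by delegating to put().
    default void putAll(Map<? extends K, ? extends V> entries) {
        for (Map.Entry<? extends K, ? extends V> e : entries.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }
}

// Trivial in-memory implementation, only to show that the interface is usable.
final class MapFooterCache<K, V> implements FooterCache<K, V> {
    private final Map<K, V> map = new HashMap<>();

    @Override
    public V get(K file) {
        return map.get(file);
    }

    @Override
    public void put(K file, V info) {
        map.put(file, info);
    }
}
```

    With this shape an implementation only has to supply get and put; the bulk variants can
be overridden where a batched lookup is cheaper.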



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1588)
<https://reviews.apache.org/r/38702/#comment157551>

    initialize with files.size()?
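
    A small sketch of the idea, assuming the collection in question is a list that receives
one element per input file. PresizeSketch and describe are hypothetical names, and the String
payload is a placeholder for whatever the real code collects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not from the patch under review.
final class PresizeSketch {
    // Builds one result per input file; the capacity hint avoids repeated array growth.
    static List<String> describe(List<String> files) {
        List<String> result = new ArrayList<>(files.size());
        for (String f : files) {
            result.add("footer:" + f);
        }
        return result;
    }
}
```

    For a HashMap the same hint would also need to account for the load factor.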



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1643)
<https://reviews.apache.org/r/38702/#comment157558>

    I am guessing the reason for getWithFastCheck is to avoid the expensive compat check during
split generation in the single-query case. We cannot cache the HiveConf object across queries, as
the configs may not be the same. If that's the case, why not store the Hive object in a ThreadLocal?
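
    To make the suggestion concrete, a minimal sketch of a per-thread holder. PerThreadHolder
is a hypothetical class and T stands in for the Hive object; how the real object would be
created or invalidated is not shown here and is left as an assumption. Requires Java 8 for
ThreadLocal.withInitial.

```java
import java.util.function.Supplier;

// Hypothetical holder; T stands in for the Hive client object.
final class PerThreadHolder<T> {
    private final ThreadLocal<T> local;

    PerThreadHolder(Supplier<T> factory) {
        // The factory runs lazily, once per thread, on that thread's first get().
        this.local = ThreadLocal.withInitial(factory);
    }

    // Returns this thread's instance, creating it on first use.
    T get() {
        return local.get();
    }

    // Drops this thread's instance, e.g. when the next query brings a different configuration.
    void reset() {
        local.remove();
    }
}
```

    Each thread pays the creation cost once and then reuses its own instance; calling reset()
between queries would force a fresh instance on the next get().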


- Prasanth_J


On Sept. 24, 2015, 1:03 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38702/
> -----------------------------------------------------------
> 
> (Updated Sept. 24, 2015, 1:03 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java f3e2168 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 815f499 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 6f15fd0 
>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java e4a6cdb 
>   ql/pom.xml 36b3433 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 3511e73 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 87881b6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 2500fb6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java cc03df7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java ab539c4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 99896c6 
> 
> Diff: https://reviews.apache.org/r/38702/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>

