hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From j.prasant...@gmail.com
Subject Re: Review Request 40415: HIVE-11675 make use of file footer PPD API in ETL strategy or separate strategy
Date Tue, 24 Nov 2015 23:29:46 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40415/#review107860
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 675)
<https://reviews.apache.org/r/40415/#comment167176>

    Can you make ppdResults are return value and not inOut parameter?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1083)
<https://reviews.apache.org/r/40415/#comment167177>

    Can this ever get large than 64MB? If so setSizeLimit should be increased in CodedInputStream.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1102)
<https://reviews.apache.org/r/40415/#comment167181>

    Can you make use of Guava's Range instead?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1123)
<https://reviews.apache.org/r/40415/#comment167182>

    Looks like current is being populated inside this method. Can this method be made to return
Range instead?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1826)
<https://reviews.apache.org/r/40415/#comment167183>

    Return ppdResult instead?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1917)
<https://reviews.apache.org/r/40415/#comment167200>

    Can you reuse Utilities' threadlocal kryo object here?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1964)
<https://reviews.apache.org/r/40415/#comment167198>

    I don't clearly understand this. Why do we populate corruptIds list if ppd is disabled?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 2055)
<https://reviews.apache.org/r/40415/#comment167184>

    You can use the deserialized sarg as cloned object here. Utilities.clonePlan() does that.
Alternatively, you can try use kryo.copy() for deep copying object. There is a slight difference
though, serializing and deserializing for deep copy does not copy transient fields but kryo.copy()
will copy transient fields as well. HIVE-12424 for more info.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 2061)
<https://reviews.apache.org/r/40415/#comment167195>

    Reading this back should get you cloned object.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 2116)
<https://reviews.apache.org/r/40415/#comment167194>

    All ByteArrayOutputStream methods are synchronized. Should this be as well?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 3384)
<https://reviews.apache.org/r/40415/#comment167185>

    unused variable: conf



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 3393)
<https://reviews.apache.org/r/40415/#comment167186>

    same here



ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java (line 1035)
<https://reviews.apache.org/r/40415/#comment167190>

    Will it be easy to add unit tests for this patch? Or qfile test.



storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java (line 173)
<https://reviews.apache.org/r/40415/#comment167189>

    This is still not deep copy. literal and literalList can be refs. I think we should use
kryo instead. This will also be fragile when we add fields to this class.


- Prasanth_J


On Nov. 18, 2015, 12:23 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40415/
> -----------------------------------------------------------
> 
> (Updated Nov. 18, 2015, 12:23 a.m.)
> 
> 
> Review request for hive, Gopal V, Prasanth_J, and Vikram Dixit Kumaraswamy.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 953e52c 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java c5e7a5f

>   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java aa96f77 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 46862da 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 488d923 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java ec90481 
>   storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java dc71db4 
>   storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java d70b3b0

>   storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeff131

> 
> Diff: https://reviews.apache.org/r/40415/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message