hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergio Pena" <sergio.p...@cloudera.com>
Subject Re: Review Request 34473: HIVE-10749 Implement Insert statement for parquet
Date Thu, 21 May 2015 19:18:18 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84758
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136104>

    Could you separate words with _? Like ENABLE_ACID_SCHEMA_INFO. It helps to read the constant
more easily.
    
    Do we have to enable transactions exclusively for parquet? Isn't there another variable
that enables trasnactions on Hive that we can use?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136111>

    Could you separate the workds? Like ENABLE_ACID_SCHEMA_INFO. It makes the code more readable.
    
    Also, isn't there another variable that we can use to detect if transactions are enabled?
I am not sure if we should add more variables to Hive.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136107>

    You can use this one line to return the column list:
    
    return (List<String>) StringUtils.getStringCollection(tableProperties.getProperty(IOConstants.COLUMNS));
    
    It will return an empty list array if COLUMN is empty.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136112>

    You can save code by using this line:
    
    return (List<String>) StringUtils.getStringCollection(tableProperties.getProperty(IOConstants.COLUMNS));
    
    It will return an empty list if the parameter is empty.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136108>

    You can call TypeInfoUtils.getTypeINfosFromTypeString() with an empty string here. It
will return an empty list. Let's save code by using:
    
    ArrayList<TypeInfo> columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136113>

    You can save code by using this line:
    
    ArrayList<TypeInfo> columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
    
    It will return an empty list if the parameter is empty.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136109>

    Same here, you can save code with this:
    
    ArrayList<String> columnNames = (ArrayList<String>) StringUtils.getStringCollection(columnNameProperty);



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136114>

    Same thing here:
    
    ArrayList<String> columnNames = (ArrayList<String>) StringUtils.getStringCollection(columnNameProperty);



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/34473/#comment136117>

    Why do you need a Writable? HIVE-9658 tries to avoid wrapping java types into writable
if they are being used by Hive to save memory usage.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java
<https://reviews.apache.org/r/34473/#comment136116>

    I am waiting to commit the patch from HIVE-10749 that uses a similar class named ObjectArrayWritableObjectInspector.

    
    Also, I think this is already part o the parquet branch.


- Sergio Pena


On May 21, 2015, 7:45 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
> 
> (Updated May 21, 2015, 7:45 a.m.)
> 
> 
> Review request for hive, Alan Gates and Sergio Pena.
> 
> 
> Bugs: HIVE-10749
>     https://issues.apache.org/jira/browse/HIVE-10749
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Implement the insert statement for parquet format.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c

>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
f513572 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java
PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java
PRE-CREATION 
>   ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/34473/diff/
> 
> 
> Testing
> -------
> 
> Newly added qtest and UT passed locally
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message