hive-dev mailing list archives

From "Mohit Sabharwal" <mo...@cloudera.com>
Subject Re: Review Request 32499: HIVE-10086: Hive throws error when accessing Parquet file schema using field name match
Date Thu, 26 Mar 2015 19:11:16 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32499/#review77924
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
<https://reviews.apache.org/r/32499/#comment126274>

    Why remove static?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
<https://reviews.apache.org/r/32499/#comment126279>

    It looks like this method is called recursively (to deal with nested fields). Can we have
    duplicate column names across nesting levels?



ql/src/test/queries/clientpositive/parquet_schema_evolution.q
<https://reviews.apache.org/r/32499/#comment126280>

    Add a case where structs are nested (struct inside struct)?


- Mohit Sabharwal


On March 25, 2015, 10:42 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32499/
> -----------------------------------------------------------
> 
> (Updated March 25, 2015, 10:42 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-10086
>     https://issues.apache.org/jira/browse/HIVE-10086
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Attached is the patch that handles schemas that do not match between Parquet and Hive.
> 
> In this case, Parquet data is accessed by name matching. The table columns may be in a
> different order than the Parquet schema, but if a column name matches a Parquet column name,
> then the value is retrieved.
> 
> Also, if the Hive schema has columns or struct elements that do not match the Parquet
> schema, then NULL values are returned for them instead.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 57ae7a9740d55b407cadfc8bc030593b29f90700
>   ql/src/test/queries/clientpositive/parquet_schema_evolution.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/parquet_table_with_subschema.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_schema_evolution.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_table_with_subschema.q.out PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/32499/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
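
For illustration, the name-based matching in the quoted description, and the nested-field question raised in the review, can be sketched roughly as follows. This is a hypothetical, self-contained Java model built on plain maps. It is not the code in DataWritableReadSupport.java and it does not use the Parquet API. The class NameMatchSketch, the method project, and the sample columns (id, name, address, zip) are all invented for this example. It assumes each nesting level is resolved only against its own fields, so a name repeated at different levels does not collide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the code from the patch under review. */
public class NameMatchSketch {

    /**
     * Projects a file record onto the table's column layout by name.
     * In tableSchema, a leaf column maps to a type name (String) and a struct
     * column maps to a nested Map describing its own fields.
     * Columns that the file record does not contain come back as null.
     */
    @SuppressWarnings("unchecked")
    public static List<Object> project(Map<String, Object> tableSchema,
                                       Map<String, Object> fileRecord) {
        List<Object> row = new ArrayList<>();
        for (Map.Entry<String, Object> column : tableSchema.entrySet()) {
            Object declared = column.getValue();
            // Lookup is by name, so column order in the file does not matter.
            Object value = fileRecord.get(column.getKey());
            if (declared instanceof Map && value instanceof Map) {
                // Recurse with only this struct's fields in scope, so an inner
                // "id" cannot be confused with the top-level "id".
                row.add(project((Map<String, Object>) declared,
                                (Map<String, Object>) value));
            } else if (declared instanceof Map) {
                row.add(null); // struct declared in the table but absent in the file
            } else {
                row.add(value); // leaf column; null when the name is absent
            }
        }
        return row;
    }

    public static void main(String[] args) {
        // Table layout: id, name, address struct<id, zip>
        Map<String, Object> addressSchema = new LinkedHashMap<>();
        addressSchema.put("id", "int");
        addressSchema.put("zip", "string");
        Map<String, Object> tableSchema = new LinkedHashMap<>();
        tableSchema.put("id", "int");
        tableSchema.put("name", "string");
        tableSchema.put("address", addressSchema);

        // File record: different field order, no "name", no "address.zip"
        Map<String, Object> fileAddress = new LinkedHashMap<>();
        fileAddress.put("id", 7);
        Map<String, Object> fileRecord = new LinkedHashMap<>();
        fileRecord.put("address", fileAddress);
        fileRecord.put("id", 1);

        System.out.println(project(tableSchema, fileRecord)); // [1, null, [7, null]]
    }
}
```

In this model the top-level id and address.id resolve independently because each recursive call sees only the fields of one struct. Whether the actual patch scopes lookups the same way is exactly what the review comment above is asking.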

