hive-dev mailing list archives

From cheng xu <cheng.a...@intel.com>
Subject Re: Review Request 48716: HIVE-13873 Column pruning for nested fields
Date Thu, 07 Jul 2016 01:02:54 GMT


> On July 6, 2016, 10:48 p.m., Aihua Xu wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java, line 122
> > <https://reviews.apache.org/r/48716/diff/1/?file=1419370#file1419370line122>
> >
> >     Just try to understand the logic (not too familiar with Parquet). So the underneath
> >     parquet already supports "hive.io.file.readgroup.paths" or this is totally within hive?
> >     How are the struct data stored in parquet and pruned with the group path in general?

Parquet itself doesn't support this configuration; it is handled entirely within Hive. We
reconstruct the requested schema on the Hive side by pruning the unneeded columns, as the
other projections do.
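
To illustrate the idea of rebuilding a requested schema from a list of group paths, here is a
minimal sketch. It is not the code from the patch: the `Node` class, the `prune` method, and
the dotted-path format ("s.b.c") are all assumptions made for illustration, and no Parquet or
Hive classes are used.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; Node and prune are hypothetical, not Hive or Parquet APIs.
final class NestedPruneSketch {

    /** A schema node: a group if it has children, otherwise a primitive leaf. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.put(k.name, k);
        }

        @Override public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('{');
            String sep = "";
            for (Node c : children.values()) { sb.append(sep).append(c); sep = ","; }
            return sb.append('}').toString();
        }
    }

    /** Returns a copy of the full schema keeping only the subtrees named by dotted paths. */
    static Node prune(Node full, List<String> paths) {
        Node out = new Node(full.name);
        for (String path : paths) copyPath(full, out, path.split("\\."), 0);
        return out;
    }

    // Copies the field named by parts[i..] from src into dst; returns false if the path is unknown.
    private static boolean copyPath(Node src, Node dst, String[] parts, int i) {
        Node child = src.children.get(parts[i]);
        if (child == null) return false;            // unknown paths are ignored in this sketch
        if (i == parts.length - 1) {                // last segment: keep the whole subtree
            dst.children.put(child.name, deepCopy(child));
            return true;
        }
        Node kept = dst.children.get(child.name);
        boolean fresh = (kept == null);
        if (fresh) kept = new Node(child.name);
        boolean found = copyPath(child, kept, parts, i + 1);
        if (found && fresh) dst.children.put(child.name, kept); // attach only if something matched
        return found;
    }

    private static Node deepCopy(Node n) {
        Node copy = new Node(n.name);
        for (Node c : n.children.values()) copy.children.put(c.name, deepCopy(c));
        return copy;
    }

    public static void main(String[] args) {
        Node full = new Node("root",
            new Node("id"),
            new Node("s", new Node("a"), new Node("b", new Node("c"), new Node("d"))),
            new Node("t"));
        System.out.println(prune(full, List.of("s.a", "s.b.c"))); // prints root{s{a,b{c}}}
    }
}
```

In this sketch the pruned schema lists fields in the order the paths were requested, not in
the file's original order; a real reader would probably need to preserve the file order.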


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48716/#review140991
-----------------------------------------------------------


On June 15, 2016, 11:34 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48716/
> -----------------------------------------------------------
> 
> (Updated June 15, 2016, 11:34 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-13873
>     https://issues.apache.org/jira/browse/HIVE-13873
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Add group projection support for Parquet. This is an initial patch to share my thoughts.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 23abec3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 6afe957 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 24bf506 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java cfedf35 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java db923fa 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveStructConverter.java a89aa4d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 3e38cc7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 74a1a82 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java 611a6b7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java a2a7f00 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/test/queries/clientpositive/parquet_struct.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_struct.q.out PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 0c7ac30 
> 
> Diff: https://reviews.apache.org/r/48716/diff/
> 
> 
> Testing
> -------
> 
> Newly added qtest passed.
> 
> 
> Thanks,
> 
> cheng xu
> 
>

