hive-issues mailing list archives

From "Ferdinand Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
Date Mon, 11 Sep 2017 02:09:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160552#comment-16160552 ]

Ferdinand Xu commented on HIVE-17261:
-------------------------------------

Thanks, Junjie Chen, for the patch.
One comment has not been addressed yet:
In ParquetRecordReaderBase.java
* Please remove the @Deprecated annotation, since we are not using the deprecated constructor
in L65

A few more comments:
In ParquetRecordReaderBase.java
* Remove the unnecessary return in L131
In TestParquetRowGroupFilter.java
* Since the filter takes effect automatically inside the Parquet reader, we should add test
cases that verify it at the reader level. The current tests only cover
RowGroupFilter.filterRowGroups.
 
Could you create a review board next time for review? Thank you!

> Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-17261
>                 URL: https://issues.apache.org/jira/browse/HIVE-17261
>             Project: Hive
>          Issue Type: Improvement
>          Components: Database/Schema
>    Affects Versions: 2.2.0
>            Reporter: Junjie Chen
>            Assignee: Junjie Chen
>         Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive uses the deprecated ParquetInputSplit constructor in [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see the interface definition in [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> The old interface sets row group offset values, which causes the dictionary filter in
> Parquet to be skipped.
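
The effect described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: SplitFilterSketch, RowGroup, Split and selectRowGroups are hypothetical names invented for illustration, not parquet-mr or Hive classes, and the dictionary check is reduced to a plain predicate. It assumes that a split which already carries row group offsets is read as-is, while a split without offsets lets the reader pick row groups and apply the filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model only: none of these types are the real parquet-mr or Hive classes. */
public class SplitFilterSketch {

    /** Hypothetical stand-in for a row group: file offset plus its dictionary values. */
    static final class RowGroup {
        final long offset;
        final List<String> dictionary;

        RowGroup(long offset, List<String> dictionary) {
            this.offset = offset;
            this.dictionary = dictionary;
        }
    }

    /** Hypothetical stand-in for a split; rowGroupOffsets is null when the reader should choose. */
    static final class Split {
        final long start;
        final long end;
        final long[] rowGroupOffsets;

        Split(long start, long end, long[] rowGroupOffsets) {
            this.start = start;
            this.end = end;
            this.rowGroupOffsets = rowGroupOffsets;
        }
    }

    /** Returns the row groups that this toy reader would actually scan for the given split. */
    static List<RowGroup> selectRowGroups(Split split, List<RowGroup> footer,
                                          Predicate<RowGroup> filter) {
        List<RowGroup> selected = new ArrayList<>();
        if (split.rowGroupOffsets != null) {
            // Assumed old-style path: offsets were fixed by the caller,
            // so the filter is never consulted.
            for (RowGroup rg : footer) {
                for (long wanted : split.rowGroupOffsets) {
                    if (rg.offset == wanted) {
                        selected.add(rg);
                        break;
                    }
                }
            }
        } else {
            // Assumed new-style path: keep row groups that start inside the split
            // range and also pass the filter.
            for (RowGroup rg : footer) {
                boolean inRange = rg.offset >= split.start && rg.offset < split.end;
                if (inRange && filter.test(rg)) {
                    selected.add(rg);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<RowGroup> footer = Arrays.asList(
                new RowGroup(4L, Arrays.asList("a", "b")),
                new RowGroup(100L, Arrays.asList("c")));
        // Simplified dictionary filter: keep a row group only if its dictionary contains "c".
        Predicate<RowGroup> hasC = rg -> rg.dictionary.contains("c");

        Split withOffsets = new Split(0L, 200L, new long[] {4L, 100L});
        Split withoutOffsets = new Split(0L, 200L, null);

        System.out.println("with offsets: "
                + selectRowGroups(withOffsets, footer, hasC).size());
        System.out.println("without offsets: "
                + selectRowGroups(withoutOffsets, footer, hasC).size());
    }
}
```

In this model the split with explicit offsets keeps both row groups, while the split without offsets keeps only the one whose dictionary contains the requested value.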



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
