hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11035) PPD: Orc Split elimination fails because filterColumns=[-1]
Date Wed, 17 Jun 2015 22:19:01 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590728#comment-14590728 ]

Prasanth Jayachandran commented on HIVE-11035:
----------------------------------------------

This is a regression introduced by HIVE-7052. Because of it, the id of the first column is
not pushed down, although the column name is. For the example in the description, the value
of hive.io.file.readcolumn.ids in the conf is empty, while the value of
hive.io.file.readcolumn.names is ",x" (a comma is prepended). Making the CSV joiner more
robust fixes the issue: the column ids are pushed down properly, and SARG then uses them to
populate the filterColumns array.
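
To illustrate how a prepended comma like the one in ",x" can arise, here is a minimal sketch of
a naive versus a more defensive CSV append. The class and method names (CsvJoinSketch,
naiveAppend, safeAppend) are hypothetical and are not taken from the Hive code base or from
the attached patch; the sketch only shows the general failure mode, assuming the joiner
appends to a possibly empty existing value.

```java
final class CsvJoinSketch {
    // Hypothetical naive append: always inserts a separator,
    // even when the existing value is empty.
    static String naiveAppend(String existing, String value) {
        return existing + "," + value;
    }

    // Hypothetical defensive append: inserts a separator only
    // between two non-empty parts.
    static String safeAppend(String existing, String value) {
        if (existing == null || existing.isEmpty()) {
            return value;
        }
        if (value == null || value.isEmpty()) {
            return existing;
        }
        return existing + "," + value;
    }

    public static void main(String[] args) {
        // Appending the first entry to an empty list of names.
        System.out.println(naiveAppend("", "x")); // prints ",x"
        System.out.println(safeAppend("", "x"));  // prints "x"
        // Appending to a non-empty list behaves the same in both variants.
        System.out.println(safeAppend("0", "1")); // prints "0,1"
    }
}
```

With the defensive variant, a list that starts out empty never acquires a leading separator,
so a reader that splits on commas does not see a spurious empty first entry.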

> PPD: Orc Split elimination fails because filterColumns=[-1]
> -----------------------------------------------------------
>
>                 Key: HIVE-11035
>                 URL: https://issues.apache.org/jira/browse/HIVE-11035
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.3.0, 2.0.0
>            Reporter: Gopal V
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-11035.patch
>
>
> {code}
> create temporary table xx (x int) stored as orc ;
> insert into xx values (20),(200);
> set hive.fetch.task.conversion=none;
> select * from xx where x is null;
> {code}
> This should generate zero tasks after optional split elimination in the app master, instead of generating the 1 task which for sure hits the row-index filters and removes all rows anyway.
> Right now, this runs 1 task for the stripe containing (min=20, max=200, has_null=false), which is broken.
> Instead, it returns YES_NO_NULL from the following default case
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L976
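
The effect of filterColumns=[-1] on split elimination can be sketched as follows. This is a
simplified illustration, not the OrcInputFormat code: the types StripeElimSketch, Truth and
ColStats and the methods evalIsNull and mustRead are hypothetical, and the truth values are
reduced to three cases. It assumes that an unresolved column (index -1) cannot be checked
against the stripe statistics, so the predicate falls back to "could be anything".

```java
final class StripeElimSketch {
    // Simplified truth values: definitely no, definitely yes, or unknown.
    enum Truth { YES, NO, YES_NO_NULL }

    // Hypothetical per-column stripe statistics.
    static final class ColStats {
        final long min;
        final long max;
        final boolean hasNull;

        ColStats(long min, long max, boolean hasNull) {
            this.min = min;
            this.max = max;
            this.hasNull = hasNull;
        }
    }

    // Evaluate "column IS NULL" against the stripe statistics.
    // filterColumn == -1 means the column id was never resolved.
    static Truth evalIsNull(int filterColumn, ColStats[] stats) {
        if (filterColumn == -1) {
            // No statistics can be consulted, so nothing can be ruled out.
            return Truth.YES_NO_NULL;
        }
        // Without nulls in the stripe, IS NULL can never match.
        return stats[filterColumn].hasNull ? Truth.YES_NO_NULL : Truth.NO;
    }

    // A stripe can be skipped only when the predicate is definitely false.
    static boolean mustRead(Truth t) {
        return t != Truth.NO;
    }

    public static void main(String[] args) {
        ColStats[] stats = { new ColStats(20, 200, false) };
        System.out.println(mustRead(evalIsNull(0, stats)));  // prints "false"
        System.out.println(mustRead(evalIsNull(-1, stats))); // prints "true"
    }
}
```

In this sketch, resolving the column to index 0 lets the stripe with has_null=false be
eliminated, while index -1 forces it to be read, which matches the extra task described above.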



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
