hive-issues mailing list archives

From "Gabriel C Balan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13377) Lost rows when using compact index on parquet table
Date Tue, 29 Mar 2016 18:50:25 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216620#comment-15216620 ]

Gabriel C Balan commented on HIVE-13377:
----------------------------------------

Gently pinging [~Ferd], [~dongc], [~spena], [~ashutoshc].

> Lost rows when using compact index on parquet table
> ---------------------------------------------------
>
>                 Key: HIVE-13377
>                 URL: https://issues.apache.org/jira/browse/HIVE-13377
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing
>    Affects Versions: 1.1.0
>         Environment: linux, cdh 5.5.0
>            Reporter: Gabriel C Balan
>            Priority: Minor
>
> A query with a WHERE clause on a Parquet table loses rows when it uses a compact index. The same query produces the correct results without the index.
> {code}
> create table small_parq(i int) stored as parquet;
> insert into table small_parq values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11);
> set hive.optimize.index.filter=true;
> set hive.optimize.index.filter.compact.minsize=50;
> create index comp_idx on table small_parq (i) as 'compact' WITH DEFERRED REBUILD;
> alter index comp_idx on small_parq rebuild;
> select * from small_parq where i=3;
> --this correctly produces 1 row (value 3).
> select * from small_parq where i=11;
> --this incorrectly produces 0 rows.
> --I see correct results when looking for a row in [1,6];
> --I see bad results when looking for a row in [7,11].
> --All is well once I disable the compact index
> set hive.optimize.index.filter.compact.minsize=50000000;
> select * from small_parq where i=11;
> --now it correctly produces 1 row (value 11).
> {code}
> I can't seem to reproduce this issue when the base table is stored as ORC, SEQ, AVRO, or TEXTFILE.



