hive-issues mailing list archives

From "Peter Vary (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-23143) Transactions: PPD in Delete deltas is broken
Date Mon, 06 Apr 2020 16:03:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-23143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076431#comment-17076431 ]

Peter Vary edited comment on HIVE-23143 at 4/6/20, 4:02 PM:
------------------------------------------------------------

Good point [~asomani]!
Currently we suggest that customers avoid using "reserved" column names, but I absolutely
agree that this is not a good solution. If you have a good idea or patch, I would be happy to
review it.

Thanks,
Peter


was (Author: pvary):
Good point [~asomani]! Any ideas how to solve this?

> Transactions: PPD in Delete deltas is broken
> --------------------------------------------
>
>                 Key: HIVE-23143
>                 URL: https://issues.apache.org/jira/browse/HIVE-23143
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Abhishek Somani
>            Priority: Major
>
> The optimization introduced in HIVE-16812 seems broken. PPD is not happening for delete
> deltas, and it also causes wrong results if data column names conflict with ACID ROW__ID
> column names (bucket, originalTransactionId etc.).
> This seems to be happening because, after ORC-491, all PPD for ACID ORC files happens on
> data columns only, so the filters for delete PPD never get applied to metadata columns and
> are applied to data columns instead. When a data column shares a name with one of those
> metadata columns (like "bucket" in the example below), the query returns wrong results.
> Steps to repro:
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.query.results.cache.enabled=false;
> create table test(a int, bucket int) stored as orc tblproperties("transactional"="true");
> insert into table test values (1, 1111), (2, 2222), (3, 3333);
> delete from test where a = 2;
> select * from test; -- Wrong results: returns the deleted row as well
> set hive.txn.filter.delete.events=false;
> select * from test; -- Correct results: the deleted row is not returned
> {code}
> cc [~pvary] [~gopalv]
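
The name collision described in the issue can be illustrated with a toy model. This is a sketch only and is not Hive or ORC code: `resolve_filter_column`, `ACID_META_COLUMNS`, and the `data_only` flag are hypothetical names invented for this illustration, and the two metadata column names are simply the ones quoted in the issue. It assumes, as the issue states, that after ORC-491 a filter column is looked up among data columns only.

```python
from typing import List, Optional, Tuple

# Hypothetical subset of ACID ROW__ID metadata column names,
# limited to the two that the issue text mentions.
ACID_META_COLUMNS = ["originalTransactionId", "bucket"]


def resolve_filter_column(
    name: str, data_columns: List[str], data_only: bool
) -> Optional[Tuple[str, str]]:
    """Return (scope, name) for the column a filter binds to, or None.

    data_only=True models the behaviour the issue attributes to ORC-491:
    only data columns are searched. data_only=False models what the
    delete-event filter expects: metadata columns are matched first.
    """
    if data_only:
        return ("data", name) if name in data_columns else None
    if name in ACID_META_COLUMNS:
        return ("meta", name)
    return ("data", name) if name in data_columns else None


# Table from the repro steps: the data column "bucket" shadows the metadata name.
repro_columns = ["a", "bucket"]
# Hypothetical variant that avoids the "reserved" name.
renamed_columns = ["a", "bucket_id"]

# The filter meant for the metadata column binds to the data column instead.
print(resolve_filter_column("bucket", repro_columns, data_only=True))
# With no conflicting data column the filter binds to nothing, so no pushdown.
print(resolve_filter_column("bucket", renamed_columns, data_only=True))
# The binding the delete-event filter was written for.
print(resolve_filter_column("bucket", repro_columns, data_only=False))
```

In this model the first call corresponds to the wrong-results case in the repro, and the second to the case where PPD silently does not happen for delete deltas. Avoiding the conflicting name removes the first problem but not the second.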



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
