hive-issues mailing list archives

From "Marta Kuczora (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-23114) Insert overwrite with dynamic partitioning is not working correctly with direct insert
Date Tue, 31 Mar 2020 15:30:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071894#comment-17071894 ]

Marta Kuczora edited comment on HIVE-23114 at 3/31/20, 3:29 PM:
----------------------------------------------------------------

The implementation in HIVE-21164 is built upon the implementation of insert-only (also called MM) tables. That original implementation does not seem to handle the use case where multiple FileSinkOperators are present in one task and all of them write the same table, and the query in the reproduction steps is exactly that type of query.
With multi-insert queries, this issue occurs only if dynamic partitions are involved: in that case we end up with two FileSinkOperators within one task, each writing to the same table.
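Schematically, the problematic query shape is the following (a condensed form of the reproduction query from the description below; the table and column names are only placeholders):

{noformat}
from source_table s
insert overwrite table target_table partition (c)
  select s.a, s.b, s.c where s.c is not null   -- first FileSinkOperator
insert overwrite table target_table partition (c)
  select s.a, s.b, s.c where s.c is null;      -- second FileSinkOperator
{noformat}

Both branches write the same table with dynamic partitioning, so both FileSinkOperators end up in the same task.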
The basic steps a FileSinkOperator performs are the following:
- It writes the data files.
- When it is finished, in closeOp it creates a manifest file which lists the successfully written data files.
- At the end, in jobCloseOp it reads the manifest file and cleans up every file which was written to the table but is not listed in the manifest.
There are multiple places where a problem can occur, depending on the order in which the closeOp and jobCloseOp methods of the FileSinkOperators are executed. There can be a collision in the manifest file creation, because both FileSinkOperators try to create the manifest with the same path. It can also happen that one FileSinkOperator's cleanup deletes the data written by the other FileSinkOperator.

The FileSinkOperators of the query in the reproduction steps write data to different partitions: one writes into the partitions where c is not null, and the other one writes to the partition where c is null. I propose a fix for this use case in this Jira.

It can also happen that multiple FileSinkOperators write data into overlapping partitions, but that use case is not straightforward to fix. How an insert overwrite query behaves with overlapping partitions is not consistent between table types: for external tables, the last FileSinkOperator overwrites the previously written data, while for ACID tables the data written by the first FileSinkOperator is the final result. So this use case needs more clarification; until it is fixed, a workaround is to switch off direct insert or to split the query into separate insert overwrite statements, as sketched below.
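For example, with the multiinsert_test_acid and multiinsert_test_text tables from the reproduction steps below, the two workarounds could look like this (just a sketch to illustrate the idea, not a tested patch):

{noformat}
-- Workaround 1: switch off direct insert for the session, so the
-- insert overwrite goes through the regular staging/move code path
set hive.acid.direct.insert.enabled=false;

-- Workaround 2: split the multi-insert into separate insert overwrite
-- statements, so that each task contains only one FileSinkOperator
insert overwrite table multiinsert_test_acid partition (c)
select a, b, c from multiinsert_test_text where c is not null;

insert overwrite table multiinsert_test_acid partition (c)
select a, b, c from multiinsert_test_text where c is null;
{noformat}

Since an insert overwrite with dynamic partitioning only overwrites the partitions it actually writes into, the two separate statements should produce the same final partitions as the original multi-insert.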



> Insert overwrite with dynamic partitioning is not working correctly with direct insert
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-23114
>                 URL: https://issues.apache.org/jira/browse/HIVE-23114
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>
> This is a follow-up Jira for the [conversation|https://issues.apache.org/jira/browse/HIVE-21164?focusedCommentId=17059280&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17059280] in HIVE-21164.
>  Doing an insert overwrite from a multi-insert statement with dynamic partitioning will give wrong results for ACID tables when 'hive.acid.direct.insert.enabled' is true, or for insert-only tables.
> Reproduction:
> {noformat}
> set hive.acid.direct.insert.enabled=true;
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.vectorized.execution.enabled=false;
> set hive.stats.autogather=false;
> create external table multiinsert_test_text (a int, b int, c int) stored as textfile;
> insert into multiinsert_test_text values (1111, 11, 1111), (2222, 22, 1111), (3333, 33, 2222), (4444, 44, NULL), (5555, 55, NULL);
> create table multiinsert_test_acid (a int, b int) partitioned by (c int) stored as orc tblproperties('transactional'='true');
> create table multiinsert_test_mm (a int, b int) partitioned by (c int) stored as orc tblproperties('transactional'='true', 'transactional_properties'='insert_only');
> from multiinsert_test_text a
> insert overwrite table multiinsert_test_acid partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is not null
> insert overwrite table multiinsert_test_acid partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is null;
> select * from multiinsert_test_acid;
> from multiinsert_test_text a
> insert overwrite table multiinsert_test_mm partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is not null
> insert overwrite table multiinsert_test_mm partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is null;
> select * from multiinsert_test_mm;
> {noformat}
> The result of these steps can differ, depending on the execution order of the FileSinkOperators of the insert overwrite statements. It can happen that an error occurs due to a manifest file collision, or that no error occurs but the result is incorrect.
>  Running the same insert query with an external table, or with an ACID table with 'hive.acid.direct.insert.enabled=false', gives the following result:
> {noformat}
> 1111    11      1111
> 2222    22      1111
> 3333    33      2222
> 4444    44      NULL
> 5555    55      NULL
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
