hive-issues mailing list archives

From "Steve Yeom (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-19186) Multi Table INSERT statements query has a flaw for partitioned table when INSERT INTO and INSERT OVERWRITE are used
Date Wed, 18 Apr 2018 00:15:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-19186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441692#comment-16441692 ]

Steve Yeom commented on HIVE-19186:
-----------------------------------

Hi [~ashutoshc], can you review this patch?

In a multi-table insert, each INSERT clause is identified by its clause name, which is
"dest" in the context of the getFileSinkPlan() method.
For our test case, a multi-table insert query with both INSERT INTO and INSERT OVERWRITE,
getFileSinkPlan() is called once for each INSERT clause.
The bug in this Jira is that for the INSERT OVERWRITE clause we set the "isInsertInto"
flag to true, which produces the wrong loadType.
The fix is to set that flag to the correct value.

As you can see from the newly added "multi_insert_partitioned.q", the statistics and metadata
look OK. I double-checked the results and the "DESC FORMATTED" statement output by
splitting the multi-table insert query into separate queries, each with a single INSERT statement.
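
The cross-check described above could look roughly like the following. This is only a sketch: the exact statements used for the check are not included in this message, so the single-insert forms and the SELECT/DESC FORMATTED calls below are assumptions derived from the test case quoted in the issue description.

```sql
-- Assumed single-insert equivalents of the two clauses in the
-- multi-table insert from the issue description.
insert into table multi_partitioned partition(p=2)
select p, key from intermediate;

insert overwrite table multi_partitioned partition(p=1)
select key, p from intermediate;

-- Compare rows, statistics and metadata with the multi-insert run.
select * from multi_partitioned order by p, key, key2;
desc formatted multi_partitioned partition(p=1);
desc formatted multi_partitioned partition(p=2);
```

If the flag is set correctly, running these against a fresh copy of the tables should give the same rows and partition statistics as the single multi-table insert statement.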

> Multi Table INSERT statements query has a flaw for partitioned table when INSERT INTO and INSERT OVERWRITE are used
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19186
>                 URL: https://issues.apache.org/jira/browse/HIVE-19186
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 3.0.0
>            Reporter: Steve Yeom
>            Assignee: Steve Yeom
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HIVE-19186.01.patch, HIVE-19186.02.patch
>
>
> One problem test case is: 
> create table intermediate(key int) partitioned by (p int) stored as orc;
> insert into table intermediate partition(p='455') select distinct key from src where key >= 0 order by key desc limit 2;
> insert into table intermediate partition(p='456') select distinct key from src where key is not null order by key asc limit 2;
> insert into table intermediate partition(p='457') select distinct key from src where key >= 100 order by key asc limit 2;
> create table multi_partitioned (key int, key2 int) partitioned by (p int);
> from intermediate
> insert into table multi_partitioned partition(p=2) select p, key
> insert overwrite table multi_partitioned partition(p=1) select key, p;


