hive-user mailing list archives

From Guodong Wang
Subject multiple insert clauses for the same table
Date Sat, 11 Oct 2014 02:33:32 GMT
I am using Hive 0.12.0. When a single SQL statement contains multiple
insert clauses for the same table, the Hive query plan analyzer appears to
fail to synthesize the right plan.

Here is the issue.

create table T1(i int, j int);
create table T2(m int) partitioned by (n int);
explain from T1
insert into table T2 partition (n = 1)
  select T1.i where T1.j = 1
insert overwrite table T2 partition (n = 2)
  select T1.i where T1.j = 2

When there is an "insert into" clause among the multiple insert clauses,
the "insert overwrite" clause is also treated as "insert into".

I dug into the source code, and it looks like Hive does not support mixing
"insert into" and "insert overwrite" for the same table in multiple insert
clauses.

Here are my findings.
1. In the semantic analyzer, when processing TOK_INSERT_INTO, the analyzer
puts the table name into a set that contains all the "insert into" tables.
2. When generating the file sink plan, the analyzer checks whether the table
name is in that set. If it is, the replace flag is set to false. Because the
set is keyed by table name, this applies to every clause that targets the
table, including the "insert overwrite" one. Here is the code snippet.
      // Create the work for moving the table
      // NOTE: specify Dynamic partitions in dest_tab for WriteEntity
      if (!isNonNativeTable) {
        ltd = new LoadTableDesc(queryTmpdir,
            table_desc, dpCtx);


        if (holdDDLTime) {
"this query will not update transient_lastDdlTime!");
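
The two steps above can be modeled with a small standalone sketch. This is
not Hive code: the class InsertIntoSketch and the methods recordInsertInto
and replaceFlagFor are hypothetical names made up for illustration, and the
sketch assumes the set is keyed by table name only, as described in the
findings.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the behavior described above; not the Hive source.
public class InsertIntoSketch {
    // Stand-in for the analyzer's set of "insert into" target tables.
    // Assumption: it stores table names only, so partition info is lost.
    private final Set<String> insertIntoTables = new HashSet<>();

    // Step 1: called when an "insert into" clause (TOK_INSERT_INTO) is seen.
    public void recordInsertInto(String tableName) {
        insertIntoTables.add(tableName.toLowerCase());
    }

    // Step 2: called while planning the file sink for any insert clause.
    // Returns the replace flag: true means overwrite, false means append.
    public boolean replaceFlagFor(String tableName) {
        return !insertIntoTables.contains(tableName.toLowerCase());
    }

    public static void main(String[] args) {
        InsertIntoSketch analyzer = new InsertIntoSketch();

        // Clause 1: insert into table T2 partition (n = 1)
        analyzer.recordInsertInto("T2");
        // Clause 2: insert overwrite table T2 partition (n = 2)
        // records nothing, because it is not an "insert into" clause.

        // Both clauses look up the same table name when planning.
        boolean replaceForN1 = analyzer.replaceFlagFor("T2");
        boolean replaceForN2 = analyzer.replaceFlagFor("T2");

        System.out.println("partition n=1 replace=" + replaceForN1);
        System.out.println("partition n=2 replace=" + replaceForN2);
    }
}
```

In this model both lookups return false, so the "insert overwrite" clause
for partition n = 2 is planned as an append, which matches what the explain
output shows.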

My questions are:
1. Is this a limitation of Hive multiple insert clauses? The Hive HQL manual
does not mention this limitation.
2. Or is this a bug in the Hive analyzer?


