hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <>
Subject [jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved
Date Fri, 09 Dec 2016 20:10:59 GMT


Eugene Koifman commented on HIVE-15048:

That is not what it does. The code removes the table WriteEntity for the target table and replaces
it with some number of partition WriteEntity objects for that table.
So conceptually it does the same thing as before.

If you look at the new .q.out, the output shows the set of inputs/outputs that it ends up with
(not clearly highlighted, but they are there).
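
The replacement described above can be sketched roughly as follows. This is not the code from the patch: plain strings in the "db@table" / "db@table@partSpec" notation from the issue stand in for Hive's ReadEntity/WriteEntity objects, and the class name, the method name partitionOutputs, and the check that a partition belongs to the table being updated are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class WriteEntitySketch {

    /**
     * Hypothetical helper: swap the table-level write entity of the updated
     * table for the partition-level entities of that same table found in the
     * inputs. Partitions of any other table are ignored.
     */
    static Set<String> partitionOutputs(List<String> inputs,
                                        Set<String> outputs,
                                        String targetTable) {
        // Collect only the input partitions that belong to the updated table.
        List<String> targetPartitions = new ArrayList<>();
        for (String in : inputs) {
            if (in.startsWith(targetTable + "@")) {
                targetPartitions.add(in);
            }
        }
        Set<String> result = new LinkedHashSet<>(outputs);
        // Assumption: an unpartitioned table keeps its table-level entity.
        if (!targetPartitions.isEmpty()) {
            result.remove(targetTable);
            result.addAll(targetPartitions);
        }
        return result;
    }

    public static void main(String[] args) {
        // The query from the issue: "source" is updated, "target" is only read.
        List<String> inputs = Arrays.asList(
            "default@source", "default@target", "default@target@p=2/q=2");
        Set<String> outputs = new LinkedHashSet<>(Arrays.asList("default@source"));
        // prints [default@source]
        System.out.println(partitionOutputs(inputs, outputs, "default@source"));
    }
}
```

In this sketch the partition default@target@p=2/q=2 never reaches the outputs, because it does not belong to the table being updated. When the updated table is itself partitioned, its table-level entry is swapped for the partitions that appear in the inputs.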

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> ----------------------------------------------------------------------------
>                 Key: HIVE-15048
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, HIVE-15048.03.patch, HIVE-15048.04.patch
> See TestDbTxnManager2 for referenced methods
> {noformat}
>     checkCmdOnDriver("create table target (a int, b int) " +
>       "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>       "stored as orc TBLPROPERTIES ('transactional'='true')");
>     checkCmdOnDriver("create table source (a1 int, b1 int, p1 int, q1 int) " +
>       "clustered by (a1) into 2  buckets " +
>       "stored as orc TBLPROPERTIES ('transactional'='true')");
>     checkCmdOnDriver("insert into target partition(p,q) values (1,2,1,2), " +
>       "(3,4,1,2), (5,6,1,3), (7,8,2,2)");
>     checkCmdOnDriver(
>       "update source set b1 = 1 where p1 in (select t.q from target t where t.p=2)");
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> Which is clearly wrong for outputs - the table being updated, source, is not even partitioned,
> yet the only output is a partition of target.
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze().
> I suspect that an
> update T ... where T.p IN (select d from T where ...)
> type query would also get messed up (but not necessarily fail) if T is partitioned and
> the subquery filters out some partitions, because that does not mean that the same partitions
> are filtered out in the parent query.
