phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergey Soldatov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-4586) UPSERT SELECT doesn't take in account comparison operators for subqueries.
Date Tue, 06 Feb 2018 22:27:01 GMT
Sergey Soldatov created PHOENIX-4586:
----------------------------------------

             Summary: UPSERT SELECT doesn't take in account comparison operators for subqueries.
                 Key: PHOENIX-4586
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4586
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.14.0
            Reporter: Sergey Soldatov
             Fix For: 4.14.0


If upsert select has a where condition that is using any comparison operator (including ANY/SOME/etc),
the whole WHERE clause just ignored. Table:
{noformat}
create table T (id integer primary key, i1 integer);
upsert into T values (1,1);
upsert into T values (2,2);
{noformat}
Query that should not upsert anything because we have a condition in where that I1 should
be greater than any value we already have as well as not existing ID:
{noformat}
0: jdbc:phoenix:> upsert into T select id, 4 from T where id = 3 AND i1 > (select i1
from T);
2 rows affected (0.02 seconds)
0: jdbc:phoenix:> select * from T;
+-----+-----+
| ID  | I1  |
+-----+-----+
| 1   | 4   |
| 2   | 4   |
+-----+-----+
2 rows selected (0.014 seconds)
{noformat}
Now with ANY.  Should not upsert anything as well because ID is [1,2], while I1 are all '4':
{noformat}
0: jdbc:phoenix:> upsert into T select id, 5 from T where id = 2 AND i1 = ANY (select ID
from T);
2 rows affected (0.016 seconds)
0: jdbc:phoenix:> select * from T;
+-----+-----+
| ID  | I1  |
+-----+-----+
| 1   | 5   |
| 2   | 5   |
+-----+-----+
2 rows selected (0.013 seconds)
{noformat}
A similar query with IN works just fine:
{noformat}
0: jdbc:phoenix:> upsert into T select id, 6 from T where id = 2 AND i1 IN (select ID from
T);
No rows affected (0.094 seconds)
0: jdbc:phoenix:> select * from T;
+-----+-----+
| ID  | I1  |
+-----+-----+
| 1   | 5   |
| 2   | 5   |
+-----+-----+
2 rows selected (0.014 seconds)
{noformat}

The reason for this behavior is that for IN we convert subselect to semi-join and execute
upsert on the client side.  For comparisons, we don't perform any transformations and query
is considered flat and finally executed on the server side.  Not sure why, but we also completely
ignore the second condition in WHERE clause as well and that may lead to a serious data loss.

[~jamestaylor], [~maryannxue] any thoughts or suggestions how to fix that are really appreciated.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message