phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sokolov Yura (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4504) Subquery with ORDER BY on salted table gives wrong results
Date Wed, 27 Dec 2017 18:43:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304728#comment-16304728
] 

Sokolov Yura commented on PHOENIX-4504:
---------------------------------------

Query results:
{code}
0: jdbc:phoenix:localhost:2181:/hbase> select id, ts from product_history_v3 where merchantid
= '1479114284851799852-2-11-118-1577502676' and ts < 1499472000000 and ts > 1498867200000
order by id, ts limit 30;
+-------------------------------------------+----------------+
|                    ID                     |       TS       |
+-------------------------------------------+----------------+
| 1464229783114162363-3-1-553-2857867552    | 1498936510676  |
| 1464229783114162363-3-1-553-2857867552    | 1498980130157  |
| 1464229783114162363-3-1-553-2857867552    | 1499047781507  |
| 1464229783114162363-3-1-553-2857867552    | 1499180409615  |
| 1464247850112432488-236-1-553-2644393506  | 1498915623062  |
| 1464247850112432488-236-1-553-2644393506  | 1498971253149  |
| 1464247850112432488-236-1-553-2644393506  | 1499048440459  |
| 1464247850112432488-236-1-553-2644393506  | 1499158314215  |
| 1464247850112432488-236-1-553-2644393506  | 1499259823270  |
| 1464247852027309711-255-1-553-3238414857  | 1498934879322  |
| 1464247852027309711-255-1-553-3238414857  | 1498979086034  |
| 1464247852027309711-255-1-553-3238414857  | 1499048440538  |
| 1464247852027309711-255-1-553-3238414857  | 1499179134343  |
| 1464247868867164868-81-1-553-620607591    | 1498877737342  |
| 1464247868867164868-81-1-553-620607591    | 1498961698314  |
| 1464247868867164868-81-1-553-620607591    | 1499048440623  |
| 1464247868867164868-81-1-553-620607591    | 1499134677065  |
| 1464247868867164868-81-1-553-620607591    | 1499223466036  |
| 1464247873075803152-102-1-553-831204716   | 1498917114334  |
| 1464247873075803152-102-1-553-831204716   | 1498971840572  |
| 1464247873075803152-102-1-553-831204716   | 1499048440705  |
| 1464247873075803152-102-1-553-831204716   | 1499159759145  |
| 1464247873075803152-102-1-553-831204716   | 1499261515858  |
| 1464247873268431014-104-1-553-1667649981  | 1498920527056  |
| 1464247873268431014-104-1-553-1667649981  | 1498972582415  |
| 1464247873268431014-104-1-553-1667649981  | 1499048440784  |
| 1464247873268431014-104-1-553-1667649981  | 1499107657865  |
| 1464247873268431014-104-1-553-1667649981  | 1499132841653  |
| 1464247873268431014-104-1-553-1667649981  | 1499139107062  |
| 1464247873268431014-104-1-553-1667649981  | 1499163196961  |
+-------------------------------------------+----------------+
30 rows selected (0,274 seconds)
{code}
{code}
0: jdbc:phoenix:localhost:2181:/hbase> select id, ts from product_history_v3 where (id,
ts) in (select id, ts from product_history_v3 where merchantid = '1479114284851799852-2-11-118-1577502676'
and ts < 1499472000000 and ts > 1498867200000 order by id, ts limit 30) order by id,
ts limit 30;
+---------------------------------------------+----------------+
|                     ID                      |       TS       |
+---------------------------------------------+----------------+
| 1464337455925822715-9-1-553-1737427551      | 1498867467198  |
| 1464342306748747866-224-1-553-2477984196    | 1498867741603  |
| 1464344584747196409-109-1-553-3617551231    | 1498867520459  |
| 1464354915226445545-12-1-553-3326867460     | 1498867329716  |
| 1464354951171259893-205-1-553-157712144     | 1498867937367  |
| 1465807522931047248-254-1-26341-3040374238  | 1498868565649  |
| 1465807593822096005-148-1-26341-2315313399  | 1498868335480  |
| 1465969384474095425-222-1-582-3798093076    | 1498868492803  |
| 1465970904962947474-201-1-582-2549082058    | 1498867460112  |
| 1465973068310049179-126-1-582-2560628236    | 1498867756532  |
| 1465976833000334290-144-1-26341-3964894838  | 1498869392946  |
| 1465980370698420387-241-1-26341-2279175928  | 1498867965885  |
| 1465980570230565465-153-1-582-1882378697    | 1498868981412  |
| 1465981770283025085-1-1-582-2291683338      | 1498869557486  |
| 1466076354192621871-148-1-582-3639476582    | 1498869573669  |
| 1466247928637235332-189-1-26341-29981431    | 1498869562923  |
| 1466343753985642490-1-1-582-3786071318      | 1498869288741  |
| 1466510716584116158-92-1-26341-3274171161   | 1498869578179  |
| 1466516750800325950-185-1-26341-3458616731  | 1498869582376  |
| 1466601802728707669-241-1-26341-3251933984  | 1498869592565  |
| 1466601895261888572-226-1-26341-1997428701  | 1498869598320  |
| 1469086675558023209-103-1-26341-2051254469  | 1498869409408  |
| 1470471122450741988-251-1-26341-3428426480  | 1498868080181  |
| 1473755155974969573-84-1-26312-936070239    | 1498869371107  |
| 1474977282760354022-97-1-582-3584026058     | 1498869323363  |
| 1476515701269758184-93-1-26341-2090804045   | 1498868961391  |
| 1479381860049720395-36-1-629-1697505625     | 1498868108260  |
| 1479800887135410698-115-1-582-3133031776    | 1498867326627  |
| 1480820701323905492-186-1-629-554270586     | 1498867888693  |
| 1482475116359338911-164-1-26341-583821995   | 1498867459225  |
+---------------------------------------------+----------------+
30 rows selected (2,693 seconds)
{code}
{code}
0: jdbc:phoenix:localhost:2181:/hbase> select id, ts from product_history_v3 where (id,
ts) in (select id, ts from product_history_v3 where merchantid = '1479114284851799852-2-11-118-1577502676'
and ts < 1499472000000 and ts > 1498867200000 order by id||'-', ts limit 30) order by
id, ts limit 30;
+-------------------------------------------+----------------+
|                    ID                     |       TS       |
+-------------------------------------------+----------------+
| 1464229783114162363-3-1-553-2857867552    | 1498936510676  |
| 1464229783114162363-3-1-553-2857867552    | 1498980130157  |
| 1464229783114162363-3-1-553-2857867552    | 1499047781507  |
| 1464229783114162363-3-1-553-2857867552    | 1499180409615  |
| 1464247850112432488-236-1-553-2644393506  | 1498915623062  |
| 1464247850112432488-236-1-553-2644393506  | 1498971253149  |
| 1464247850112432488-236-1-553-2644393506  | 1499048440459  |
| 1464247850112432488-236-1-553-2644393506  | 1499158314215  |
| 1464247850112432488-236-1-553-2644393506  | 1499259823270  |
| 1464247852027309711-255-1-553-3238414857  | 1498934879322  |
| 1464247852027309711-255-1-553-3238414857  | 1498979086034  |
| 1464247852027309711-255-1-553-3238414857  | 1499048440538  |
| 1464247852027309711-255-1-553-3238414857  | 1499179134343  |
| 1464247868867164868-81-1-553-620607591    | 1498877737342  |
| 1464247868867164868-81-1-553-620607591    | 1498961698314  |
| 1464247868867164868-81-1-553-620607591    | 1499048440623  |
| 1464247868867164868-81-1-553-620607591    | 1499134677065  |
| 1464247868867164868-81-1-553-620607591    | 1499223466036  |
| 1464247873075803152-102-1-553-831204716   | 1498917114334  |
| 1464247873075803152-102-1-553-831204716   | 1498971840572  |
| 1464247873075803152-102-1-553-831204716   | 1499048440705  |
| 1464247873075803152-102-1-553-831204716   | 1499159759145  |
| 1464247873075803152-102-1-553-831204716   | 1499261515858  |
| 1464247873268431014-104-1-553-1667649981  | 1498920527056  |
| 1464247873268431014-104-1-553-1667649981  | 1498972582415  |
| 1464247873268431014-104-1-553-1667649981  | 1499048440784  |
| 1464247873268431014-104-1-553-1667649981  | 1499107657865  |
| 1464247873268431014-104-1-553-1667649981  | 1499132841653  |
| 1464247873268431014-104-1-553-1667649981  | 1499139107062  |
| 1464247873268431014-104-1-553-1667649981  | 1499163196961  |
+-------------------------------------------+----------------+
30 rows selected (1,364 seconds)
{code}

> Subquery with ORDER BY on salted table gives wrong results
> ----------------------------------------------------------
>
>                 Key: PHOENIX-4504
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4504
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>         Environment: amazon emr phoenix 4.11.0 hbase 1.3
>            Reporter: Sokolov Yura
>
> Probably it is already fixed. Having a quick search I didn't find exact problem description.
> I have a table:
> {code:sql}
> create immutable table product_history_v3 (
>         ts bigint not null,
>         id varchar not null,
>         product varchar,
>         merchantid varchar,
>         storeid varchar,
>         constraint pk primary key (ts, id)
> ) compression=LZ4,max_filesize=150000000,memstore_flushsize=70000000,
>         versions=1,update_cache_frequency=1000,append_only_schema=true,
>         guid_posts_width=10000000,
> SALT_BUCKETS=20;
> create local index product_history_v3_id_ts on product_history_v3 (id, ts) compression=LZ4;
> create local index product_history_v3_merchantid_ts on product_history_v3 (merchantid,
ts) include (id) compression=LZ4;
> create local index product_history_v3_storeid_ts on product_history_v3 (storeid, ts)
include (id) compression=LZ4;
> {code}
> Simple select by merchanid ordering by id,ts returns correct results:
> {code:sql}
> 0: jdbc:phoenix:localhost:2181:/hbase> explain select id, ts from product_history_v3
where merchantid = '1479114284851799852-2-11-118-1577502676' and ts < 1499472000000 and
ts > 1498867200000 order by id, ts limit 30;
> +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |                                                                                   
                 PLAN                                                                    
                                |
> +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | CLIENT 20-CHUNK 0 ROWS 0 BYTES PARALLEL 20-WAY RANGE SCAN OVER PRODUCT_HISTORY_V3 [2,'1479114284851799852-2-11-118-1577502676',1498867200001]
- [2,'1479114284851799852-2-11-118-1577502676',1499472000000]  |
> |     SERVER FILTER BY FIRST KEY ONLY                                               
                                                                                         
                                |
> |     SERVER TOP 30 ROWS SORTED BY ["ID", "TS"]                                     
                                                                                         
                              |
> | CLIENT MERGE SORT                                                                 
                                                                                         
                                |
> | CLIENT LIMIT 30                                                                   
                                                                                         
                             |
> +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> 5 rows selected (0,019 seconds)
> {code}
> It runs very fast until I add {{product}} to selected fields (cause average length of
{{product}} is 10kb).
> So I'm trying to fetch id,ts in subquery, and product in outer query. It runs fast, but
returns incorrect results: set of rows doesn't match to set of rows returned by query above.
> {code}
> 0: jdbc:phoenix:localhost:2181:/hbase> explain select id, ts, substr(product,1,30)
from product_history_v3 where (id, ts) in (select id, ts from product_history_v3 where merchantid
= '1479114284851799852-2-11-118-1577502676' and ts < 1499472000000 and ts > 1498867200000
order by id, ts limit 30) order by id, ts limit 30;
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |                                                                                   
                     PLAN                                                                
                                        |
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | CLIENT 40-CHUNK 915204 ROWS 6291521905 BYTES PARALLEL 40-WAY FULL SCAN OVER PRODUCT_HISTORY_V3
> |     SERVER TOP 30 ROWS SORTED BY [PRODUCT_HISTORY_V3.ID, PRODUCT_HISTORY_V3.TS]
> | CLIENT MERGE SORT
> | CLIENT LIMIT 30
> |     SKIP-SCAN-JOIN TABLE 0
> |         CLIENT 20-CHUNK 0 ROWS 0 BYTES PARALLEL 20-WAY RANGE SCAN OVER PRODUCT_HISTORY_V3
[2,'1479114284851799852-2-11-118-1577502676',1498867200001] - [2,'1479114284851799852-2-11-118-1577502676',1499472000000]
> |             SERVER FILTER BY FIRST KEY ONLY
> |             SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["ID", "TS"] LIMIT 30 GROUPS
> |         CLIENT MERGE SORT
> |         CLIENT 30 ROW LIMIT
> |     DYNAMIC SERVER FILTER BY (PRODUCT_HISTORY_V3.TS, PRODUCT_HISTORY_V3.ID) IN (($470.$473,
$470.$472))
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> 11 rows selected (0,021 seconds)
> {code}
> However, if I change ordering a bit, so planner is forced for reordering, then set of
rows is equal to original query:
> {code}
> 0: jdbc:phoenix:localhost:2181:/hbase> explain select id, ts, substr(product,1,30)
from product_history_v3 where (id, ts) in (select id, ts from product_history_v3 where merchantid
= '1479114284851799852-2-11-118-1577502676' and ts < 1499472000000 and ts > 1498867200000
order by id||'-', ts limit 30) order by id, ts limit 30;
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |                                                                                   
                     PLAN
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | CLIENT 40-CHUNK 915204 ROWS 6291521905 BYTES PARALLEL 40-WAY FULL SCAN OVER PRODUCT_HISTORY_V3
> |     SERVER TOP 30 ROWS SORTED BY [PRODUCT_HISTORY_V3.ID, PRODUCT_HISTORY_V3.TS]
> | CLIENT MERGE SORT
> | CLIENT LIMIT 30
> |     SKIP-SCAN-JOIN TABLE 0
> |         CLIENT 20-CHUNK 0 ROWS 0 BYTES PARALLEL 20-WAY RANGE SCAN OVER PRODUCT_HISTORY_V3
[2,'1479114284851799852-2-11-118-1577502676',1498867200001] - [2,'1479114284851799852-2-11-118-1577502676',1499472000000]
> |             SERVER FILTER BY FIRST KEY ONLY
> |             SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY ["ID", "TS"]
> |         CLIENT MERGE SORT
> |         CLIENT TOP 30 ROWS SORTED BY [("ID" || '-'), "TS"]
> |     DYNAMIC SERVER FILTER BY (PRODUCT_HISTORY_V3.TS, PRODUCT_HISTORY_V3.ID) IN (($494.$497,
$494.$496))
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
> 11 rows selected (0,02 seconds)
> 12 rows selected (0,021 seconds)
> {code}
> There, certainly, should be a lot of rows to trigger this behaviour.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message