phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amarnath Ramamoorthi (Jira)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-5490) Queries with leading part of the primary key doing a parallel scan causes missing data
Date Thu, 26 Sep 2019 14:58:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Amarnath Ramamoorthi updated PHOENIX-5490:
------------------------------------------
    Description: 
Table properties
 * saltbucket_size=10
 * compression=snappy
 * normalization_enabled=true
 * SPLIT POLICY = ConstantSizeRegionSplitPolicy
 * SALT_BUCKETS = 10 (on creation)
 * ENCODING = FAST_DIFF

PRIMARY KEY ("id_1", "id_2", "id_3", "id_4", "id_5", "id_6", "id")

Partial primary key lookup is not working as expected.

On using only one key "id_1" = 60268375; should return 2 rows
{code:java|title=Parallel scan shows 1 rows}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
+--------------------------------------------------------------------------------------------------------------------------+
|                                                           PLAN                         
                                 |
+--------------------------------------------------------------------------------------------------------------------------+
| CLIENT 8-CHUNK 0 ROWS 0 BYTES PARALLEL 8-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE [0,60268375]
- [9,60268375]  |
|     SERVER FILTER BY FIRST KEY ONLY                                                    
                                 |
+--------------------------------------------------------------------------------------------------------------------------+
2 rows selected (0.047 seconds)
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537  |
+-----------+-------------+--------+------------+------------+------------+-------------+
1 row selected (3.24 seconds)
{code}
{code:java|title=Serial scan shows 2 rows}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs>  explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id"  FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
+-----------------------------------------------------------------------------------------------------------+
|                                                   PLAN                                 
                  |
+-----------------------------------------------------------------------------------------------------------+
| CLIENT 10-CHUNK SERIAL 10-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE [0,60268375] - [9,60268375]
 |
|     SERVER FILTER BY FIRST KEY ONLY                                                    
                  |
|     SERVER 10 ROW LIMIT                                                                
                  |
| CLIENT 10 ROW LIMIT                                                                    
                  |
+-----------------------------------------------------------------------------------------------------------+
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538  |
| 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537  |
+-----------+-------------+--------+------------+------------+------------+-------------+
2 rows selected (0.12 seconds)
{code}
Now the missing row can be retrieved on using full primary key.
{code:java}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> select "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" IN (60268375) AND "id_2" IN (1306309960)
AND "id_3" = 4 AND "id_4" = 365224322 AND "id_5" = 359338115 AND "id_6" = 523433838 AND "id"
= 1052445538;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538  |
+-----------+-------------+--------+------------+------------+------------+-------------+
1 row selected (0.07 seconds)
{code}
– Phoenix (version 4.13)
 – HBase (Version 1.2.0)

  was:
Table properties
 * saltbucket_size=10
 * compression=snappy
 * normalization_enabled=true

PRIMARY KEY ("id_1", "id_2", "id_3", "id_4", "id_5", "id_6", "id")

Partial primary key lookup is not working as expected.

On using only one key "id_1" = 60268375; should return 2 rows
{code:java|title=Parallel scan shows 1 rows}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
+--------------------------------------------------------------------------------------------------------------------------+
|                                                           PLAN                         
                                 |
+--------------------------------------------------------------------------------------------------------------------------+
| CLIENT 8-CHUNK 0 ROWS 0 BYTES PARALLEL 8-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE [0,60268375]
- [9,60268375]  |
|     SERVER FILTER BY FIRST KEY ONLY                                                    
                                 |
+--------------------------------------------------------------------------------------------------------------------------+
2 rows selected (0.047 seconds)
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537  |
+-----------+-------------+--------+------------+------------+------------+-------------+
1 row selected (3.24 seconds)
{code}
{code:java|title=Serial scan shows 2 rows}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs>  explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id"  FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
+-----------------------------------------------------------------------------------------------------------+
|                                                   PLAN                                 
                  |
+-----------------------------------------------------------------------------------------------------------+
| CLIENT 10-CHUNK SERIAL 10-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE [0,60268375] - [9,60268375]
 |
|     SERVER FILTER BY FIRST KEY ONLY                                                    
                  |
|     SERVER 10 ROW LIMIT                                                                
                  |
| CLIENT 10 ROW LIMIT                                                                    
                  |
+-----------------------------------------------------------------------------------------------------------+
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538  |
| 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537  |
+-----------+-------------+--------+------------+------------+------------+-------------+
2 rows selected (0.12 seconds)
{code}
Now the missing row can be retrieved on using full primary key.
{code:java}
0: jdbc:phoenix:labs-test-namenode-lv-101,labs> select "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" IN (60268375) AND "id_2" IN (1306309960)
AND "id_3" = 4 AND "id_4" = 365224322 AND "id_5" = 359338115 AND "id_6" = 523433838 AND "id"
= 1052445538;
+-----------+-------------+--------+------------+------------+------------+-------------+
|   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
+-----------+-------------+--------+------------+------------+------------+-------------+
| 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538  |
+-----------+-------------+--------+------------+------------+------------+-------------+
1 row selected (0.07 seconds)
{code}
– Phoenix (version 4.13)
 – HBase (Version 1.2.0)


> Queries with leading part of the primary key doing a parallel scan causes missing data
> --------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5490
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5490
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Amarnath Ramamoorthi
>            Priority: Major
>         Attachments: foo_bigtable.zip
>
>
> Table properties
>  * saltbucket_size=10
>  * compression=snappy
>  * normalization_enabled=true
>  * SPLIT POLICY = ConstantSizeRegionSplitPolicy
>  * SALT_BUCKETS = 10 (on creation)
>  * ENCODING = FAST_DIFF
> PRIMARY KEY ("id_1", "id_2", "id_3", "id_4", "id_5", "id_6", "id")
> Partial primary key lookup is not working as expected.
> On using only one key "id_1" = 60268375; should return 2 rows
> {code:java|title=Parallel scan shows 1 rows}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
> +--------------------------------------------------------------------------------------------------------------------------+
> |                                                           PLAN                    
                                      |
> +--------------------------------------------------------------------------------------------------------------------------+
> | CLIENT 8-CHUNK 0 ROWS 0 BYTES PARALLEL 8-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE
[0,60268375] - [9,60268375]  |
> |     SERVER FILTER BY FIRST KEY ONLY                                               
                                      |
> +--------------------------------------------------------------------------------------------------------------------------+
> 2 rows selected (0.047 seconds)
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> |   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537
 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 1 row selected (3.24 seconds)
> {code}
> {code:java|title=Serial scan shows 2 rows}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs>  explain SELECT "id_1", "id_2", "id_3",
"id_4", "id_5", "id_6", "id"  FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
> +-----------------------------------------------------------------------------------------------------------+
> |                                                   PLAN                            
                       |
> +-----------------------------------------------------------------------------------------------------------+
> | CLIENT 10-CHUNK SERIAL 10-WAY ROUND ROBIN RANGE SCAN OVER foo.BIG_TABLE [0,60268375]
- [9,60268375]  |
> |     SERVER FILTER BY FIRST KEY ONLY                                               
                       |
> |     SERVER 10 ROW LIMIT                                                           
                       |
> | CLIENT 10 ROW LIMIT                                                               
                       |
> +-----------------------------------------------------------------------------------------------------------+
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> SELECT "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" = 60268375 limit 10;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> |   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538
 |
> | 60268375  | 1306309959  | 4      | 365224321  | 359338114  | 523433837  | 1052445537
 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 2 rows selected (0.12 seconds)
> {code}
> Now the missing row can be retrieved on using full primary key.
> {code:java}
> 0: jdbc:phoenix:labs-test-namenode-lv-101,labs> select "id_1", "id_2", "id_3", "id_4",
"id_5", "id_6", "id" FROM "foo"."BIG_TABLE" WHERE "id_1" IN (60268375) AND "id_2" IN (1306309960)
AND "id_3" = 4 AND "id_4" = 365224322 AND "id_5" = 359338115 AND "id_6" = 523433838 AND "id"
= 1052445538;
> +-----------+-------------+--------+------------+------------+------------+-------------+
> |   id_1   |    id_2    | id_3  |   id_4    |   id_5   |  id_6   |     id      |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> | 60268375  | 1306309960  | 4      | 365224322  | 359338115  | 523433838  | 1052445538
 |
> +-----------+-------------+--------+------------+------------+------------+-------------+
> 1 row selected (0.07 seconds)
> {code}
> – Phoenix (version 4.13)
>  – HBase (Version 1.2.0)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message