phoenix-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5096) Local index region pruning is not working as expected.
Date Thu, 10 Jan 2019 21:18:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739782#comment-16739782
] 

Lars Hofhansl commented on PHOENIX-5096:
----------------------------------------

Ok, so let's try:

create table test (pk1 integer not null, pk2 integer not null, pk3 integer not null, v1 float,
v2 float, v3 integer CONSTRAINT pk PRIMARY KEY (pk1, pk2, pk3));
create local index l1 *on test(pk1, v1)*;

(Same table, but the index defined on (pk1, v1)

{code}
explain select count(*) from test;
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                         PLAN                                         | EST_BYTES_READ
 | EST_ROWS_READ  |  EST_INFO_TS   |
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 4-CHUNK 5518821 ROWS 314572800 BYTES PARALLEL 4-WAY RANGE SCAN OVER TEST [1]  | 314572800
      | 5518821        | 1547154736193  |
|     SERVER FILTER BY FIRST KEY ONLY                                                  | 314572800
      | 5518821        | 1547154736193  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                 | 314572800
      | 5518821        | 1547154736193  |
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
3 rows selected (0.113 seconds)
{code}

Again, 4 regions.

{code}
0: jdbc:phoenix:localhost> explain select count(*) from test where pk1 < 10000;
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                PLAN                                    
           | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK 5518821 ROWS 314572800 BYTES PARALLEL 1-WAY RANGE SCAN OVER TEST [1,*] -
[1,10000]  | 314572800       | 5518821        | 1547154736193  |
|     SERVER FILTER BY FIRST KEY ONLY                                                    
           | 314572800       | 5518821        | 1547154736193  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                   
           | 314572800       | 5518821        | 1547154736193  |
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
3 rows selected (0.072 seconds)
{code}

Again, all pks starting with pk1 < 10000 are in the same region.

{code}
0: jdbc:phoenix:localhost> explain select count(*) from test where v1 = 0.1;
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                         PLAN                                         | EST_BYTES_READ
 | EST_ROWS_READ  |  EST_INFO_TS   |
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 4-CHUNK 5518821 ROWS 314572800 BYTES PARALLEL 4-WAY RANGE SCAN OVER TEST [1]  | 314572800
      | 5518821        | 1547154736193  |
|     SERVER FILTER BY FIRST KEY ONLY AND TO_FLOAT("V1") = 0.1                         | 314572800
      | 5518821        | 1547154736193  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                 | 314572800
      | 5518821        | 1547154736193  |
+--------------------------------------------------------------------------------------+-----------------+----------------+----------------+
3 rows selected (0.088 seconds)
{code}

Can no longer use the index with just v1. It's doing a full scan!

{code}
0: jdbc:phoenix:localhost> explain select count(*) from test where v1 = 0.1 and pk1 <
10000;
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                PLAN                                    
           | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK 5518821 ROWS 314572800 BYTES PARALLEL 1-WAY RANGE SCAN OVER TEST [1,*] -
[1,10000]  | 314572800       | 5518821        | 1547154736193  |
|     SERVER FILTER BY FIRST KEY ONLY AND TO_FLOAT("V1") = 0.1                           
           | 314572800       | 5518821        | 1547154736193  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                   
           | 314572800       | 5518821        | 1547154736193  |
+----------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
3 rows selected (0.077 seconds)
{code}

Now it's doing a full scan over the first region, still not using the local index. (notice
the filter...)

What I expect is that this uses the local index, but only only consults the regions that can
possibly have any data based on the selection on the pk prefix (pk1 here)


> Local index region pruning is not working as expected.
> ------------------------------------------------------
>
>                 Key: PHOENIX-5096
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5096
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Major
>
> The pruning of local indexes should do the following:
> * Use the local index
> * Reduce the number of regions based on filters on the table's primary (i.e. WHERE conditions
including prefixes of the primary key)
> Instead it looks like in order for this to work the needed PK column need to be included
in the local index itself, changing the sort order of the local index.
> I'll provide some examples in the comments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message