phoenix-dev mailing list archives

From "Mujtaba Chohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4287) Incorrect aggregate query results when stats are disable for parallelization
Date Thu, 02 Nov 2017 18:39:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236356#comment-16236356 ]

Mujtaba Chohan commented on PHOENIX-4287:
-----------------------------------------

[~samarthjain] With {{ALTER ... SET USE_STATS_FOR_PARALLELIZATION=false}} on the base table and
the config also set to false globally, stats are correctly not used for parallelization when the
query runs on the base table; however, for the index they are still used. See the explain plan
below. This is with commit
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commit;h=6e80b0fb0386c48c0837d73d72dd4aee1ca15c4a

{noformat}
ALTER TABLE T SET USE_STATS_FOR_PARALLELIZATION=false;
explain select count(*) from T;
+----------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                       PLAN                                       | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+----------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY FULL SCAN OVER T_IDX  | 63050353        | 1161114        | 1509646993152  |
|     SERVER FILTER BY FIRST KEY ONLY                                              | 63050353        | 1161114        | 1509646993152  |
|     SERVER AGGREGATE INTO SINGLE ROW                                             | 63050353        | 1161114        | 1509646993152  |
+----------------------------------------------------------------------------------+-----------------+----------------+----------------+
{noformat}
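
To compare chunk counts across plans like the one above, the first plan line can be parsed mechanically. The following is only a sketch: {{parse_plan_line}} and its regular expression are hypothetical helpers, not part of Phoenix, and they assume the plan-line shape seen in this thread (an optional ROWS/BYTES estimate, then the parallelism and scan type). It extracts the values only; interpreting a high chunk count as "guideposts are still in use" remains a judgment based on the plans posted here.

```python
import re

# Hypothetical helper (not a Phoenix API): matches the first line of an
# EXPLAIN plan in the shape shown in this thread. The ROWS/BYTES estimate
# is optional because it is absent when no stats are available.
PLAN_RE = re.compile(
    r"CLIENT (?P<chunks>\d+)-CHUNK"
    r"(?: (?P<rows>\d+) ROWS (?P<bytes>\d+) BYTES)?"
    r" PARALLEL (?P<ways>\d+)-WAY"
    r" (?P<scan>FULL|RANGE) SCAN OVER (?P<table>\S+)"
)


def parse_plan_line(line):
    """Return the plan fields as a dict, or None if the line does not match."""
    m = PLAN_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "chunks": int(d["chunks"]),
        "rows": int(d["rows"]) if d["rows"] is not None else None,
        "bytes": int(d["bytes"]) if d["bytes"] is not None else None,
        "ways": int(d["ways"]),
        "scan": d["scan"],
        "table": d["table"],
    }


# Plan lines copied from this thread.
index_plan = parse_plan_line(
    "| CLIENT 11277-CHUNK 1161114 ROWS 63050353 BYTES PARALLEL 1-WAY "
    "FULL SCAN OVER T_IDX  | 63050353"
)
no_stats_plan = parse_plan_line(
    "| CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null"
)
print(index_plan["chunks"], no_stats_plan["chunks"])
```

Comparing {{chunks}} for the same query against the index and against the data table gives a quick way to spot the discrepancy described in this comment.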

> Incorrect aggregate query results when stats are disable for parallelization
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-4287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: HBase 1.3.1
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>              Labels: localIndex
>             Fix For: 4.13.0, 4.12.1
>
>         Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate query returns incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> |                                                 PLAN                                                  | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO |
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> |     SERVER FILTER BY FIRST KEY ONLY                                                                   | 625043899       | 332170         | 150792825 |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                                  | 625043899       | 332170         | 150792825 |
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +--------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> |                                               PLAN                                               | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +--------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | 438492470       | 332151         | 1507928257617  |
> |     SERVER FILTER BY FIRST KEY ONLY                                                              | 438492470       | 332151         | 1507928257617  |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                             | 438492470       | 332151         | 1507928257617  |
> +--------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +----------------------------------------------------------------------+-----------------+----------------+--------------+
> |                                 PLAN                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +----------------------------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T  | null            | null           | null         |
> |     SERVER FILTER BY FIRST KEY ONLY                                  | null            | null           | null         |
> |     SERVER AGGREGATE INTO SINGLE ROW                                 | null            | null           | null         |
> +----------------------------------------------------------------------+-----------------+----------------+--------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 333327    |
> +-----------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
