phoenix-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4544) Update statistics inconsistent behavior
Date Fri, 08 Jun 2018 00:10:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505531#comment-16505531 ]

Hudson commented on PHOENIX-4544:
---------------------------------

ABORTED: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1915 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1915/])
PHOENIX-4544 Update statistics inconsistent behavior (ankitsinghal59: rev c3201a263937a7765877c67c1bd5265516a0a3c9)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/coprocessor/StatisticsCollectionRunTrackerIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ByteUtil.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollectionRunTracker.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java


> Update statistics inconsistent behavior 
> ----------------------------------------
>
>                 Key: PHOENIX-4544
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4544
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Romil Choksi
>            Assignee: Ankit Singhal
>            Priority: Major
>             Fix For: 5.0.0, 4.15.0
>
>         Attachments: PHOENIX-4544.patch, PHOENIX-4544_v1.patch
>
>
> Update statistics may not generate stats information for all dependent indexes, and this behavior may depend on whether the command is executed synchronously or asynchronously.
> I have a table GIGANTIC_TABLE with ~500k rows, a global index I1, and a local index I2.
> If async is turned on (the default value):
> {noformat}
> 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL;
> No rows affected (0.081 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 5                             |
> +-------------------------------+
> 1 row selected (0.009 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 520                           |
> +-------------------------------+
> 1 row selected (0.014 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 0                             |
> +-------------------------------+
> 1 row selected (0.008 seconds)
> 0: jdbc:phoenix:>
> {noformat}
> As we can see, there are no records for local index I2. But if we run statistics for indexes:
> {noformat}
> 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
> No rows affected (0.036 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 20                            |
> +-------------------------------+
> 1 row selected (0.007 seconds)
> {noformat}
> the statistics for the local index are generated correctly.
> Now we turn async off:
> {noformat}
> 0: jdbc:phoenix:> delete from SYSTEM.STATS;
> 547 rows affected (0.079 seconds)
> 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL;
> 999,998 rows affected (4.671 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 520                           |
> +-------------------------------+
> 1 row selected (0.04 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 20                            |
> +-------------------------------+
> 1 row selected (0.012 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 0                             |
> +-------------------------------+
> 1 row selected (0.011 seconds)
> {noformat}
> As we can see, we got statistics for the table itself and the local index, but not for the global index.
> Moreover, if we try to update statistics for indexes:
> {noformat}
> 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
> 499,999 rows affected (0.332 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 0                             |
> +-------------------------------+
> 1 row selected (0.009 seconds)
> {noformat}
> So, there are still no records for the global index.
> But if we delete statistics first and run update for indexes:
> {noformat}
> 0: jdbc:phoenix:> delete from SYSTEM.STATS;
> 541 rows affected (0.024 seconds)
> 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
> 999,998 rows affected (0.41 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 5                             |
> +-------------------------------+
> 1 row selected (0.01 seconds)
> 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
> +-------------------------------+
> | COUNT(GUIDE_POSTS_ROW_COUNT)  |
> +-------------------------------+
> | 20                            |
> +-------------------------------+
> 1 row selected (0.01 seconds)
> {noformat}
> then we get statistics for both the local and global indexes.
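> The per-index checks above could probably be combined into a single grouped query, which makes a missing set of statistics easier to spot. This is only a sketch and was not part of the session above; it assumes the same table and index names and uses only the SYSTEM.STATS columns already queried in this report:
> {noformat}
> select PHYSICAL_NAME, COLUMN_FAMILY, count(GUIDE_POSTS_ROW_COUNT)
> from SYSTEM.STATS
> where PHYSICAL_NAME in ('GIGANTIC_TABLE', 'I1')
> group by PHYSICAL_NAME, COLUMN_FAMILY;
> {noformat}
> Each returned row corresponds to one physical table and column family that has guideposts. A combination that is absent from the result (for example I1 with '0', or GIGANTIC_TABLE with 'L#0') has no statistics rows at all.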



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
