phoenix-issues mailing list archives

From "Vlad Krava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5287) incorrect results on COUNT(*) or COUNT(1) with GLOBAL INDEX
Date Fri, 17 May 2019 21:12:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842577#comment-16842577 ]

Vlad Krava commented on PHOENIX-5287:
-------------------------------------

*IMHO:*

I suspect this issue is related to the tickets about a Phoenix global index getting out of
sync with its main table:
 # PHOENIX-5211
 # PHOENIX-3845
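
A quick sanity check on the figures in the quoted report seems to support this. If key_1 is unique in both tables, the gap between the two COUNT results should equal the gap between the two diff selects. A minimal sketch in Python (the function name is hypothetical, and the numbers are copied from the report):

```python
def counts_consistent(count_a, count_b, only_in_a, only_in_b):
    """Check the set identity |A| - |B| == |A \\ B| - |B \\ A|.

    It holds whenever the keys are unique within each table.
    """
    return count_a - count_b == only_in_a - only_in_b


# Figures from the report
count_a = 218623   # select count(1) from POMG.TABLE_A
count_b = 218683   # select count(1) from POMG.TABLE_B
only_in_a = 0      # key_1 in TABLE_A but not in TABLE_B
only_in_b = 23     # key_1 in TABLE_B but not in TABLE_A

gap_from_counts = count_b - count_a     # 60
gap_from_diffs = only_in_b - only_in_a  # 23

print(gap_from_counts, gap_from_diffs)
print(counts_consistent(count_a, count_b, only_in_a, only_in_b))  # False
```

The counts differ by 60 while the diff selects differ by only 23, which would be consistent with COUNT being served from an index that is out of sync with the table.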

> incorrect results on COUNT(*) or COUNT(1) with GLOBAL INDEX
> -----------------------------------------------------------
>
>                 Key: PHOENIX-5287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>         Environment: Environment and data information:
>  * Column structure of TABLE_A is fully identical to TABLE_B
>  * TABLE_A has a GLOBAL INDEX
>  * TABLE_B has a LOCAL INDEX
>            Reporter: Vlad Krava
>            Priority: Blocker
>
> COUNT(*) and COUNT(1) queries return incorrect (outdated) results for a table with a GLOBAL index.
> *Example:*
>  * Export TABLE_A to a CSV file (SELECT * FROM *POMG.TABLE_A*)
>  * Import CSV file to TABLE_B
>  * A COUNT on TABLE_A consistently returned 218623 (for 2 days, without any data modifications):
>  ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_A*;
>  *** RESULT: 218623
>  * The table newly imported from the CSV file (TABLE_B) showed a different, higher record count:
>  ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_B*;
>  *** RESULT: 218683
>  * A COUNT in HBase returns a larger value than the COUNT on the Phoenix table (218683 vs 218623)
>  * Phoenix statistics for this table were updated a few times over the past few days of testing
>  * I made a few attempts to pinpoint the data misalignment by diffing primary keys:
>  ** select key_1 from *POMG.TABLE_A* where key_1 not in (select key_1 from *POMG.TABLE_B*)
>  *** RESULT: 0 records selected (_doesn't make sense, considering that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_)
>  ** select key_1 from *POMG.TABLE_B* where key_1 not in (select key_1 from *POMG.TABLE_A*)
>  *** RESULT: 23 records selected (_doesn't make sense, considering that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_)
> *Workaround:*
>  * After executing ALTER INDEX with the REBUILD flag, the COUNT results for TABLE_A became identical to those for TABLE_B
>  * Diff selects didn't show any differences between *POMG.TABLE_A* and *POMG.TABLE_B*
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
