phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4333) Stats - Incorrect estimate when stats are updated on a tenant specific view
Date Thu, 02 Nov 2017 18:26:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236115#comment-16236115 ]

James Taylor edited comment on PHOENIX-4333 at 11/2/17 6:25 PM:
----------------------------------------------------------------

We really want to answer the question "Is there a guidepost within every region?". Whether
a guidepost intersects the scan is not the check we need. For example, a query doing a skip
scan may fail the intersection test even though there is a guidepost in the region.
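
To make the distinction concrete, here is a minimal sketch of the two questions side by side. It is not Phoenix code: the class, the Range type, and the use of String keys are all hypothetical simplifications, and an empty end key is assumed to mean "no upper bound".

```java
import java.util.List;

// Hypothetical, simplified model (not Phoenix code): keys are Strings and a
// range covers [start, end); an empty end key means "no upper bound".
final class GuidepostCoverage {

    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(String key) {
            return key.compareTo(start) >= 0
                    && (end.isEmpty() || key.compareTo(end) < 0);
        }
    }

    // The question we want answered: does every region hold a guidepost?
    static boolean everyRegionHasGuidepost(List<Range> regions, List<String> guideposts) {
        for (Range region : regions) {
            if (!containsAny(region, guideposts)) {
                return false;
            }
        }
        return true;
    }

    // The intersection test: does any guidepost fall inside a scanned key range?
    static boolean anyGuidepostIntersectsScan(List<Range> scanRanges, List<String> guideposts) {
        for (Range range : scanRanges) {
            if (containsAny(range, guideposts)) {
                return true;
            }
        }
        return false;
    }

    private static boolean containsAny(Range range, List<String> guideposts) {
        for (String guidepost : guideposts) {
            if (range.contains(guidepost)) {
                return true;
            }
        }
        return false;
    }
}
```

With two regions split at "m", guideposts "c" and "p", and skip-scan ranges [a,b), [d,e) and [x,y), the intersection test finds nothing while every region still contains a guidepost.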

I think if you always set the endRegionKey (instead of only when it's a local index) here
before the inner loop:
{code}
                endRegionKey = regionInfo.getEndKey();
                if (isLocalIndex) {
{code}
and then after the inner loop, check that either we set currentKeyBytes (which means we entered
the loop) or the currentGuidePost is less than the region end key, then that's enough, since
we know that the currentGuidePost is already bigger than the region start key. The check for
endKey == stopKey is a small optimization: if that's not the case, we don't need to do the key
comparison again, since we've already done it as we entered the loop (see comment below).
{code}
    // We have a guide post in the region if the above loop was entered
    // or if the current key is less than the region end key (since the loop
    // may not have been entered if our scan end key is smaller than the
    // first guide post in that region).
    gpsAvailableForAllRegions &= 
        currentKeyBytes != initialKeyBytes || 
        ( endKey == stopKey && // If not comparing against region boundary
          ( endRegionKey.length == 0 || // then check if gp is in the region
            currentGuidePost.compareTo(endRegionKey) < 0) );
{code}
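
For testing the condition in isolation, the same logic can be restated as a standalone predicate. This is only a sketch: the class and parameter names are hypothetical, the two reference comparisons are passed in as booleans, and Arrays.compareUnsigned (Java 9+) stands in for the guide post comparison, assuming keys are ordered lexicographically as unsigned bytes.

```java
import java.util.Arrays;

// Hypothetical standalone restatement of the check (not Phoenix code).
final class RegionGuidepostCheck {

    /**
     * @param loopEntered      true if the inner loop consumed at least one guide post
     *                         (currentKeyBytes != initialKeyBytes in the snippet above)
     * @param endKeyIsStopKey  true if the loop bound was the scan stop key rather
     *                         than the region boundary (endKey == stopKey)
     * @param endRegionKey     end key of the region; empty means no upper bound
     * @param currentGuidePost the next guide post that has not been consumed yet
     */
    static boolean regionHasGuidepost(boolean loopEntered,
                                      boolean endKeyIsStopKey,
                                      byte[] endRegionKey,
                                      byte[] currentGuidePost) {
        if (loopEntered) {
            return true;
        }
        if (!endKeyIsStopKey) {
            // The loop condition already compared against the region boundary.
            return false;
        }
        // Assumption: unsigned lexicographic ordering of key bytes.
        return endRegionKey.length == 0
                || Arrays.compareUnsigned(currentGuidePost, endRegionKey) < 0;
    }
}
```

The per-region results would then be folded together as in the snippet above, for example gpsAvailableForAllRegions &= RegionGuidepostCheck.regionHasGuidepost(...).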

Does this not pass all of your tests?


> Stats - Incorrect estimate when stats are updated on a tenant specific view
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-4333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4333
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>         Attachments: PHOENIX-4333_test.patch, PHOENIX-4333_v1.patch, PHOENIX-4333_v2.patch
>
>
> Consider two tenants A, B with tenant specific view on 2 separate regions/region servers.
> {noformat}
> Region 1 keys:
> A,1
> A,2
> B,1
> Region 2 keys:
> B,2
> B,3
> {noformat}
> When stats are updated on the tenant A view, querying stats on the tenant B view yields partial
> results (containing stats only for B,1), which are incorrect even though the updated timestamp
> is shown as current.



