phoenix-dev mailing list archives

From "Ethan Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-2903) Handle split during scan for row key ordered aggregations
Date Sun, 22 Oct 2017 08:17:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214228#comment-16214228 ]

Ethan Wang edited comment on PHOENIX-2903 at 10/22/17 8:16 AM:
---------------------------------------------------------------

Never mind the question in my last comment. Now I see: in BaseResultIterators, each scan is
sent to the server side to have a scanner prepared. If a split happens during this process,
the NSRE is caught and new scans are prepared and tried. The process is retried recursively
until all scanners come back OK (or retriedCount is reduced to 0).
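
To make the retry loop concrete, here is a minimal, self-contained sketch of it. It is not the actual BaseResultIterators code: ScanRetrySketch, RegionMovedException (a stand-in for NSRE), ScannerOpener, and replan are all hypothetical names, and scans are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanRetrySketch {
    /** Hypothetical stand-in for the NSRE thrown when a region has split or moved. */
    public static class RegionMovedException extends Exception {}

    /** Hypothetical callbacks: open a scanner for a scan, or re-plan a stale scan. */
    public interface ScannerOpener {
        String open(String scan) throws RegionMovedException;
        List<String> replan(String scan); // new scans built against fresh region boundaries
    }

    /** Opens every scan; on a stale region, re-plans and retries recursively. */
    public static List<String> openAll(List<String> scans, ScannerOpener opener, int retriesLeft) {
        List<String> opened = new ArrayList<>();
        for (String scan : scans) {
            try {
                opened.add(opener.open(scan));
            } catch (RegionMovedException e) {
                if (retriesLeft <= 0) {
                    // Retry budget exhausted: give up instead of looping forever.
                    throw new IllegalStateException("retries exhausted for scan " + scan, e);
                }
                // Replace the stale scan with re-planned scans and try those.
                opened.addAll(openAll(opener.replan(scan), opener, retriesLeft - 1));
            }
        }
        return opened;
    }
}
```

Each level of recursion consumes one retry, so a region that keeps splitting eventually surfaces as an error rather than retrying without bound.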


After this point, if a split happens and one or two scanners are affected, another NSRE
will be caught in TableResultIterator. In this case it simply throws a
StaleRegionBoundaryCacheException.


My understanding is that the first of these two processes is what this item focuses on.


[~jamestaylor], I don't know whether you have already tried to reproduce this NSRE for the
first scenario. Basically, a preSplit and a postCompleteSplit hook are added to
GroupedAggregateRegionObserver, so that the start and end of a split can be locked in step
with the aggregation process. That way we can reproduce this sequence:
   

 PrepareScanner -> Scanner.next() -> Split starts -> Split ends -> ServerSide Aggregation -> Aggregation finishes


For Ordered, the actual aggregation *will not start* until the first rs.next() gets
called. (unOrdered, in comparison, will have everything aggregated under the protection of
regionLock.) So, after discussing with [~vincentpoon], the right test case should trigger
a split that starts after rs.next() gets called but before the logic of next() gets executed.
Something like:


GroupedAggregateRegionObserver
{code:java}
RegionScanner doPostScannerOpen(){
        .....
        return new BaseRegionScanner(scanner) {
            private long rowCount = 0;
            private ImmutableBytesPtr currentKey = null;

            @Override
            public boolean next(List<Cell> results) throws IOException {
                permitToSplit.unlock(); // signal that the pre-split hook can start splitting now
                splitFinishes.lock();   // wait until the split finishes, then continue with the rest
                 .....
{code}
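
The handshake above can also be sketched in isolation. This standalone example is not Phoenix code: it uses two CountDownLatch instances in place of permitToSplit and splitFinishes, and SplitDuringNextSketch and the event strings are hypothetical. One reason to consider latches: if the two locks were java.util.concurrent.locks.ReentrantLock instances (an assumption, the snippet does not say), calling unlock() from a thread that does not hold the lock throws IllegalMonitorStateException, whereas a latch can be released by any thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class SplitDuringNextSketch {
    /** Runs a simulated scanner and splitter and returns the observed event order. */
    public static List<String> run() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch nextEntered = new CountDownLatch(1);   // role of permitToSplit
        CountDownLatch splitFinished = new CountDownLatch(1); // role of splitFinishes

        // Simulated pre-split / post-split hooks: wait until next() has been entered.
        Thread splitter = new Thread(() -> {
            try {
                nextEntered.await();
                events.add("split starts");
                events.add("split ends");
                splitFinished.countDown(); // let the scanner continue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Simulated next(): signal the splitter first, then block until the split is done.
        Thread scanner = new Thread(() -> {
            try {
                events.add("next() called");
                nextEntered.countDown();
                splitFinished.await();
                events.add("aggregation runs");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        splitter.start();
        scanner.start();
        try {
            splitter.join();
            scanner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
        return events;
    }
}
```

The two latches force the order next() called, split starts, split ends, aggregation runs, regardless of which thread is scheduled first.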

Thoughts?



> Handle split during scan for row key ordered aggregations
> ---------------------------------------------------------
>
>                 Key: PHOENIX-2903
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2903
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.13.0
>
>         Attachments: PHOENIX-2903_v1.patch, PHOENIX-2903_v2.patch, PHOENIX-2903_v3.patch,
PHOENIX-2903_v4_wip.patch, PHOENIX-2903_v5_wip.patch, PHOENIX-2903_wip.patch
>
>
> Currently a hole in our split detection code



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
