hbase-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan
Date Tue, 17 Mar 2015 23:36:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366310#comment-14366310 ]

Josh Elser commented on HBASE-13262:

Kudos on tracking this down!

bq. The net effect is that the client checks its size limit and sees that the limit has not
been reached, so it assumes that the region has been exhausted and moves the scanner to the
next region... so as Josh Elser predicted, the root cause is that we jump between regions
too soon....

Bingo. I just got to this point as well.

        } while (remainingResultSize > 0 && countdown > 0
            && (!partialResults.isEmpty() || possiblyNextScanner(countdown, values == null)));

One important thing that I think I've convinced myself of is that this only happens when
there are no queued partial results in the client (the presence of a partial would
force the client to talk to the same region again).
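To make the reasoning about that loop condition concrete, here is a minimal sketch of its short-circuit behavior. It is not the HBase client code: the class and method names are hypothetical, and the `regionExhausted` flag stands in for server-side state that the client in this scenario does not actually have. It only shows that the next-scanner call is reachable when the size budget and countdown are still positive and no partials are queued, and that rows would be skipped if the region was not really exhausted at that point.

```java
// Hypothetical model of the loop condition quoted above; not HBase code.
class ScanLoopSketch {

    /**
     * Mirrors the short-circuit order of
     *   remainingResultSize > 0 && countdown > 0
     *       && (!partialResults.isEmpty() || possiblyNextScanner(...))
     * and reports whether evaluation would reach the possiblyNextScanner call.
     */
    static boolean reachesNextScannerCall(long remainingResultSize,
                                          int countdown,
                                          boolean partialResultsEmpty) {
        if (remainingResultSize <= 0 || countdown <= 0) {
            return false; // the && chain stops before the parenthesized part
        }
        // Queued partials make the left side of || true, so the call is skipped.
        return partialResultsEmpty;
    }

    /**
     * Assumption for illustration: if the next-scanner call is reached while
     * the current region still has rows, those rows are never fetched.
     */
    static boolean wouldSkipRows(long remainingResultSize,
                                 int countdown,
                                 boolean partialResultsEmpty,
                                 boolean regionExhausted) {
        return reachesNextScannerCall(remainingResultSize, countdown, partialResultsEmpty)
            && !regionExhausted;
    }

    public static void main(String[] args) {
        // Budget left, no partials, region not exhausted: the problematic case.
        System.out.println(wouldSkipRows(1024L, 5, true, false));
        // Same state, but a partial result is queued: the client stays put.
        System.out.println(wouldSkipRows(1024L, 5, false, false));
    }
}
```

The sketch deliberately separates "can the client move on" from "should the client move on"; the bug described in this thread sits in the gap between the two.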

I'll take a look at the stuff you attached (again, much appreciated) and see if I can chow
down on the rest of your analysis and merge that in with what I (think I) figured out.

> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>                 Key: HBASE-13262
>                 URL: https://issues.apache.org/jira/browse/HBASE-13262
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.1.0
>         Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.0, 1.1.0
>         Attachments: testrun_0.98.txt, testrun_branch1.0.txt
> Tried to write a simple Java client against 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family and 10 qualifiers (values [0-9]), for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of the actual rows. Running against 1.0.0 returns all 10M records as expected.
> [Code I was running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java] for the curious.

This message was sent by Atlassian JIRA