hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11667) Simplify ClientScanner logic for NSREs.
Date Tue, 05 Aug 2014 04:18:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085770#comment-14085770 ]

Lars Hofhansl commented on HBASE-11667:
---------------------------------------

Ah... Now I got a run with
{code}
Failed tests:   testScansWithSplits(org.apache.hadoop.hbase.client.TestFromClientSide): expected:<7733> but was:<7743>
{code}

OK... I buy that there is an issue, although I do not understand why. An RPC either fails
or it doesn't; there is no notion of a partial RPC.
Say the scan RPC starting at 'aaa' fails with an NSRE (NotServingRegionException). Then no rows
of that RPC will be returned and it can be retried. With scanner caching = 1, the next RPC would
then try with 'bbb', which would also either succeed or fail. The scanner cannot fail partially
*during* an RPC.
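
The reasoning above can be sketched as a toy simulation. This is not HBase code: {{ToyServer}}, {{SimulatedNsre}} and {{scanAll}} are hypothetical names invented for illustration, and the "server" is just a sorted set that fails chosen RPCs. It assumes each RPC is all-or-nothing and returns at most one row (caching = 1), and shows that retrying with the same start key then neither skips nor duplicates rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ScanRetrySketch {
    /** Hypothetical stand-in for NotServingRegionException. */
    static class SimulatedNsre extends RuntimeException {}

    /** Toy server (an assumption, not an HBase class): sorted row keys plus a set of RPC numbers that fail. */
    static class ToyServer {
        private final TreeSet<String> rows;
        private final Set<Integer> failingRpcs;
        int rpcCount = 0;

        ToyServer(List<String> rows, Set<Integer> failingRpcs) {
            this.rows = new TreeSet<>(rows);
            this.failingRpcs = new HashSet<>(failingRpcs);
        }

        /**
         * One scan RPC with caching = 1: returns the first row >= startKey,
         * or null when no rows are left. It either throws or returns a row,
         * never a partial result.
         */
        String next(String startKey) {
            rpcCount++;
            if (failingRpcs.contains(rpcCount)) {
                throw new SimulatedNsre();
            }
            return rows.ceiling(startKey);
        }
    }

    /** Scans everything from firstKey, retrying a failed RPC with the same start key. */
    static List<String> scanAll(ToyServer server, String firstKey) {
        List<String> results = new ArrayList<>();
        String startKey = firstKey;
        while (true) {
            String row;
            try {
                row = server.next(startKey);
            } catch (SimulatedNsre e) {
                continue; // nothing was returned by the failed RPC, so the start key is unchanged
            }
            if (row == null) {
                break;
            }
            results.add(row);
            startKey = row + "\0"; // smallest key strictly greater than the row just seen
        }
        return results;
    }

    public static void main(String[] args) {
        ToyServer server = new ToyServer(
                Arrays.asList("aaa", "bbb", "ccc"),
                new HashSet<>(Arrays.asList(1, 3)));
        System.out.println(scanAll(server, ""));
    }
}
```

In this toy model the only state the client carries across a failure is the start key of the RPC that failed, so a mismatch in row counts would have to come from something the model leaves out.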

> Simplify ClientScanner logic for NSREs.
> ---------------------------------------
>
>                 Key: HBASE-11667
>                 URL: https://issues.apache.org/jira/browse/HBASE-11667
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.99.0, 2.0.0, 0.94.23, 0.98.6
>
>         Attachments: 11667-0.94.txt, 11667-trunk.txt, HBASE-11667-0.98.patch, IntegrationTestBigLinkedListWithRegionMovement.patch
>
>
> We ran into an issue with Phoenix where a RegionObserver coprocessor intercepts a scan and returns an aggregate (in this case a count) with a fake row key. It turns out this does not work when the {{ClientScanner}} encounters NSREs, as it uses the last key it saw to reset the scanner and try again (which in this case would be the fake key).
> This is arguably a rare case, and one could also argue that a region observer just shouldn't do this. However, while looking at {{ClientScanner}}'s code I found that this logic is not necessary.
> An NSRE occurs because we contacted a region server with a key that it no longer hosts. This is the start key, so it is always correct to retry with this same key. That simplifies the {{ClientScanner}} logic and also makes this sort of coprocessor possible.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
