hbase-issues mailing list archives

From "Abhishek Singh Chouhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18796) Admin#isTableAvailable returns incorrect result before daughter regions are opened
Date Thu, 21 Sep 2017 13:33:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174755#comment-16174755 ]

Abhishek Singh Chouhan commented on HBASE-18796:
------------------------------------------------

Spent some time looking at the failure. It looks like a problem elsewhere that this change
surfaced. The test does a split and then tries a batch get, which fails with a table-not-found
error although the table exists. This happens because, now that we do not put daughter locations
in meta before the daughters are actually opened on the regionserver, we run into
NoServerForRegionException in ConnectionImplementation#locateRegionInMeta. That should be fine,
since there are retries which should succeed as soon as the region is opened. However, one of
the retries fails with a TableNotFoundException here:

{code}
try (ReversedClientScanner rcs =
    new ReversedClientScanner(conf, s, TableName.META_TABLE_NAME, this, rpcCallerFactory,
        rpcControllerFactory, getMetaLookupPool(), metaReplicaCallTimeoutScanInMicroSecond)) {
  regionInfoRow = rcs.next();
}
if (regionInfoRow == null) {
  throw new TableNotFoundException(tableName);
}
{code}

During one of the retries, the result that we get has mayHaveMoreCellsInRow() set to true.
Since we don't call setAllowPartialResults(true) on our scan, the client uses
CompleteScanResultCache, which won't return a partial row to the caller. We received only
that one row, so regionInfoRow ends up null. After I do
{code}
    s.addFamily(HConstants.CATALOG_FAMILY);
    s.setOneRowLimit();
+   s.setAllowPartialResults(true);
    if (this.useMetaReplicas) {
      s.setConsistency(Consistency.TIMELINE);
    }
{code}
the client is able to ride over the split during its retries and the test passes.
[~tedyu] [~apurtell] This issue seems to be something that can be hit during any other retry
in locateRegionInMeta too: whenever mayHaveMoreCellsInRow() is true for the meta scan, the
client gets a TableNotFoundException and does not retry. I can open another JIRA for this if
this sounds good.
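
To make the failure mode above easier to follow, here is a minimal standalone sketch. It does
not use any HBase classes: PartialResultSketch, Row, next and locate are hypothetical names
made up for illustration, and the logic is only a simplified model of the behaviour described
in this comment (a cache that withholds a partial row unless partial results are allowed, and
a caller that treats a null row as a missing table).

```java
import java.util.ArrayList;
import java.util.List;

public class PartialResultSketch {

    // Hypothetical stand-in for a scan result; this is NOT the HBase Result class.
    static final class Row {
        final String key;
        final boolean mayHaveMoreCellsInRow;

        Row(String key, boolean mayHaveMoreCellsInRow) {
            this.key = key;
            this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
        }
    }

    // Simplified model of the client-side result cache: a row flagged as possibly
    // incomplete is handed to the caller only when partial results are allowed.
    static Row next(List<Row> serverResponses, boolean allowPartialResults) {
        for (Row row : serverResponses) {
            if (allowPartialResults || !row.mayHaveMoreCellsInRow) {
                return row;
            }
            // Partial row withheld: the cache keeps waiting for the rest of the row.
        }
        return null; // scan ended without a row the cache was willing to release
    }

    // Simplified model of the lookup: a null row is reported as a missing table,
    // which the caller treats as non-retriable.
    static String locate(List<Row> serverResponses, boolean allowPartialResults) {
        Row row = next(serverResponses, allowPartialResults);
        if (row == null) {
            return "TableNotFoundException";
        }
        return "located:" + row.key;
    }

    public static void main(String[] args) {
        // One-row-limit scan whose only row is flagged as possibly incomplete.
        List<Row> responses = new ArrayList<>();
        responses.add(new Row("daughter-region", true));

        System.out.println(locate(responses, false)); // prints "TableNotFoundException"
        System.out.println(locate(responses, true));  // prints "located:daughter-region"
    }
}
```

In this model the only difference between the two calls is the allowPartialResults flag, which
mirrors the one-line change to the meta scan shown above.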

> Admin#isTableAvailable returns incorrect result before daughter regions are opened
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-18796
>                 URL: https://issues.apache.org/jira/browse/HBASE-18796
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0
>
>         Attachments: HBASE-18796.branch-1.001.patch, HBASE-18796.branch-1.001.patch,
> HBASE-18796.branch-1.002.patch, HBASE-18796.branch-1.003.patch, HBASE-18796.master.001.patch
>
>
> Admin#isTableAvailable checks if it can getServerName for the meta entries it reads.
> During a split, server locations are added to the meta entries in MetaTableAccessor#splitRegion,
> although the description of the method says "Does not add the location information to the
> daughter regions since they are not open yet.". At this point during the split the daughter
> regions are not actually open, so we can get to a state where the parent is offline and the
> daughters are not yet open, but isTableAvailable returns true.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
