hbase-issues mailing list archives

From "Benoit Sigoure (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2121) HBase client doesn't retry the right number of times when a region is unavailable
Date Sun, 14 Nov 2010 06:22:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931769#action_12931769 ]

Benoit Sigoure commented on HBASE-2121:

Hey Gary, if you have a multi-threaded HBase app, I recommend you take a look at asynchbase
(https://github.com/stumbleupon/asynchbase).  It's an alternative HBase client that was designed
to be thread-safe and non-blocking from the ground up.

> HBase client doesn't retry the right number of times when a region is unavailable
> ---------------------------------------------------------------------------------
>                 Key: HBASE-2121
>                 URL: https://issues.apache.org/jira/browse/HBASE-2121
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.20.2, 0.90.0
>            Reporter: Benoit Sigoure
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries
> retries 10 times (by default). It ends up calling HConnectionManager$TableServers.locateRegionInMeta,
> which retries 10 times on its own. So the HBase client is effectively retrying 100 times
> before giving up, instead of 10 (10 is the default hbase.client.retries.number).
> I'm using hbase trunk HEAD.  I verified this bug is also in 0.20.2.
> Sample call stack:
>  org.apache.hadoop.hbase.client.RegionOfflineException: region offline: mytable,,1263421423787
>  	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:709)
>  	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:640)
>  	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:609)
>  	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:430)
>  	at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
>  	at org.apache.hadoop.hbase.client.ScannerCallable.instantiateServer(ScannerCallable.java:62)
>  	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1047)
>  	at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:836)
>  	at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:756)
>  	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:354)
>  	at <my application>
> How to reproduce:
> With a trivial HBase client (mine was just trying to scan the table): start the client,
> take the table the client uses offline, then tell the client to start the scan. The client
> does not give up after 10 attempts, although it is supposed to.
> If locateRegionInMeta is only ever called from getRegionServerWithRetries, then the fix
> is trivial: just remove the retry logic in there. If it has other callers that may rely
> on the retry logic in locateRegionInMeta, then the fix is going to be a bit more involved.
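
The multiplication described in the report can be reproduced without HBase. The following is only a minimal sketch of the pattern, not the actual HConnectionManager code: the class NestedRetryDemo and the methods locateOnce, locateWithRetries and countAttempts are hypothetical stand-ins, and the constant RETRIES plays the role of hbase.client.retries.number. It counts how many lookups happen when an outer retry loop wraps an inner one, and how many happen when the inner loop is removed, as the report suggests.

```java
import java.util.concurrent.atomic.AtomicInteger;

class NestedRetryDemo {
    // Stand-in for hbase.client.retries.number (default 10 per the report).
    static final int RETRIES = 10;

    // Hypothetical single lookup that always fails, as if the region were offline.
    static void locateOnce(AtomicInteger attempts) {
        attempts.incrementAndGet();
        throw new IllegalStateException("region offline");
    }

    // Hypothetical stand-in for a lookup that retries on its own,
    // like locateRegionInMeta in the report.
    static void locateWithRetries(AtomicInteger attempts) {
        IllegalStateException last = null;
        for (int i = 0; i < RETRIES; i++) {
            try {
                locateOnce(attempts);
                return;
            } catch (IllegalStateException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // inner retry budget exhausted
    }

    // Hypothetical stand-in for the outer loop, like getRegionServerWithRetries.
    // Returns the total number of lookups performed before giving up.
    static int countAttempts(boolean innerRetries) {
        AtomicInteger attempts = new AtomicInteger();
        for (int i = 0; i < RETRIES; i++) {
            try {
                if (innerRetries) {
                    locateWithRetries(attempts);
                } else {
                    locateOnce(attempts);
                }
                break; // success, stop retrying
            } catch (IllegalStateException e) {
                // swallow the failure and let the outer loop retry
            }
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println(countAttempts(true));  // prints 100: 10 outer x 10 inner
        System.out.println(countAttempts(false)); // prints 10: one retry budget only
    }
}
```

Keeping the retry budget in a single layer is what makes the configured limit mean what it says. Whether that is safe in the real client depends on the other callers of locateRegionInMeta, which the report leaves open.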

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
