hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
Date Fri, 26 Aug 2016 11:02:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438800#comment-15438800 ]

Enis Soztutar commented on HBASE-16345:
---------------------------------------

bq. The cause is that for the primary replica, if its retries are exhausted too quickly, f.get()
[1] throws an ExecutionException. This exception needs to be ignored so that the call continues
with the replicas.

Agreed. However, with the default settings we retry 35 times, sleeping 100ms between
attempts. The first timeout before we send the replica requests is 10 - 100ms, which makes
it very unlikely that the primary exhausts its retries before the replica RPCs are sent.
With a very limited number of retries and a longer timeout for the primary request, though,
this can indeed happen.
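
To make the arithmetic above concrete, here is a rough sketch of the comparison. It assumes a fixed pause between attempts and ignores RPC time and any backoff, so it only gives a lower bound; the class and method names are made up for illustration and are not part of the HBase client.

```java
final class RetryBudget {
    /** Lower bound on the time spent sleeping before retries run out (no backoff, no RPC time). */
    static long minExhaustMillis(int retries, long pauseMillis) {
        return (long) retries * pauseMillis;
    }

    /** True if the primary could run out of retries before the replica calls are sent. */
    static boolean canExhaustBeforeReplicas(int retries, long pauseMillis, long primaryTimeoutMillis) {
        return minExhaustMillis(retries, pauseMillis) < primaryTimeoutMillis;
    }

    public static void main(String[] args) {
        // Defaults quoted in the comment: 35 retries, 100ms pause, up to 100ms before replicas.
        System.out.println(minExhaustMillis(35, 100));               // 3500
        System.out.println(canExhaustBeforeReplicas(35, 100, 100));  // false
        // Hypothetical tuning: very few retries and a long primary timeout.
        System.out.println(canExhaustBeforeReplicas(2, 100, 1000));  // true
    }
}
```

Under these assumptions the defaults leave at least 3500ms of retrying against a primary timeout of at most 100ms, while two retries with a one-second primary timeout can fail before any replica call goes out.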

> RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-16345
>                 URL: https://issues.apache.org/jira/browse/HBASE-16345
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-16345-v001.patch, HBASE-16345.master.001.patch, HBASE-16345.master.002.patch
>
>
> Update for the description. Debugged further on this front based on the comments from Enis.
>
> The cause is that for the primary replica, if its retries are exhausted too quickly, f.get() [1] throws an ExecutionException. This exception needs to be ignored so that the call continues with the replicas.
> The other issue is that after the calls for the replicas are added, if the first completed task fails with an ExecutionException (because its retries were exhausted), the exception is thrown to the client [2].
> In this case, the code needs to loop through these tasks, waiting for one that succeeds. If none succeeds, it throws the exception.
> The same applies to scans.
> [1] https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197
> [2] https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219
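
The "loop through the tasks until one succeeds" behavior proposed in the description could look roughly like the sketch below. It uses the JDK's ExecutorCompletionService rather than the HBase client's own classes, and FirstSuccess and firstSuccessful are hypothetical names, so treat it as an illustration of the idea and not as the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class FirstSuccess {
    /**
     * Submits every call, then returns the first result that completes successfully.
     * Failed tasks are skipped; the last failure is rethrown only if no task succeeds.
     */
    static <T> T firstSuccessful(ExecutorService pool, List<Callable<T>> calls)
            throws InterruptedException, ExecutionException {
        if (calls.isEmpty()) {
            throw new IllegalArgumentException("no calls");
        }
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        for (Callable<T> c : calls) {
            cs.submit(c);
        }
        ExecutionException last = null;
        for (int i = 0; i < calls.size(); i++) {
            try {
                return cs.take().get();   // next task to finish, in completion order
            } catch (ExecutionException e) {
                last = e;                 // this task failed; keep waiting for another
            }
        }
        throw last;                       // every task failed
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> calls = new ArrayList<>();
            // Stand-in for a primary call whose retries run out immediately.
            calls.add(() -> { throw new IllegalStateException("primary: retries exhausted"); });
            calls.add(() -> { Thread.sleep(50); return "replica-1"; });
            calls.add(() -> { Thread.sleep(200); return "replica-2"; });
            System.out.println(firstSuccessful(pool, calls)); // prints "replica-1"
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The sketch does not cancel the remaining tasks once a result is returned; a real client would likely want to do so.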



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)