hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
Date Thu, 17 Nov 2016 20:14:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674690#comment-15674690 ]

Gary Helmling commented on HBASE-17114:

bq. AFAICS we're already doing this in ClientExceptionsUtil#isMetaClearingException and treated
CQTBE/RegionTooBusyException etc. as special exceptions:

It's only special in the sense that it should not clear the client meta cache.  I don't think
that implies it should use a different retry pause.

bq. Agree this is another good way to handle this, but by default we are still using NoBackoffPolicy

No, a number of places use ConnectionUtils.getPauseTime(), which uses an exponential backoff.
Maybe this has changed in master with the consolidated use of AsyncProcess, but that would be
an unexpected change in behavior.
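
For readers unfamiliar with that style of backoff, here is a minimal sketch of a pause that grows with the retry count. It is not the HBase source: the class name, method names, multiplier table, and 1% jitter are assumptions for illustration, so check ConnectionUtils and HConstants in your HBase version for the real values.

```java
import java.util.concurrent.ThreadLocalRandom;

final class BackoffSketch {
    // Assumed multiplier table for illustration only; the real table lives in HBase.
    static final int[] RETRY_BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Base pause in ms: the configured pause scaled by the multiplier for this attempt.
    // Attempts beyond the end of the table reuse the last multiplier.
    static long basePauseTime(long pause, int tries) {
        int idx = Math.min(Math.max(tries, 0), RETRY_BACKOFF.length - 1);
        return pause * RETRY_BACKOFF[idx];
    }

    // Adds up to 1% random jitter (an assumed amount) so clients do not retry in lockstep.
    static long pauseTimeWithJitter(long pause, int tries) {
        long base = basePauseTime(pause, tries);
        long jitter = (long) (base * ThreadLocalRandom.current().nextDouble() * 0.01);
        return base + jitter;
    }

    public static void main(String[] args) {
        for (int tries = 0; tries < 5; tries++) {
            System.out.println("try " + tries + " -> " + basePauseTime(100, tries) + " ms");
        }
    }
}
```

With a configured pause of 100 ms, the base pause under this assumed table grows from 100 ms on the first retry to 20000 ms once the table is exhausted.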

I'm -1 on using a special unique pause time for CQTBE by default.  I think it should use the
configured pause time by default.  If you want to make this overridable for some exception
types, that seems ok, but in that case the config property for overriding the value should
be more closely tied to the exception.  As a user of HBase, there's no way I would know what
"hbase.client.pause.special" means and why it is different.
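
To make the suggestion concrete, here is a rough sketch of a pause lookup that falls back to the configured pause unless an exception-specific override is set. Everything in it is hypothetical: the property name "hbase.client.pause.cqtbe", the class and method names, and the local stand-in exception class are illustrations, not the HBase API or the patch under review.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

final class PauseSelectionSketch {
    // Local stand-in for illustration; not the real HBase exception class.
    static class CallQueueTooBigException extends IOException {}

    static final String PAUSE_KEY = "hbase.client.pause";
    // Hypothetical key whose name ties the override to the exception it affects.
    static final String CQTBE_PAUSE_KEY = "hbase.client.pause.cqtbe";
    // 100 ms, consistent with the issue description (500 ms is 5 times the default).
    static final long DEFAULT_PAUSE_MS = 100;

    // Returns the configured pause unless the error is a CQTBE and an override is set.
    static long pauseFor(Throwable error, Map<String, String> conf) {
        long pause = Long.parseLong(
            conf.getOrDefault(PAUSE_KEY, String.valueOf(DEFAULT_PAUSE_MS)));
        if (error instanceof CallQueueTooBigException && conf.containsKey(CQTBE_PAUSE_KEY)) {
            return Long.parseLong(conf.get(CQTBE_PAUSE_KEY));
        }
        return pause;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(pauseFor(new CallQueueTooBigException(), conf)); // no override set
        conf.put(CQTBE_PAUSE_KEY, "500");
        System.out.println(pauseFor(new CallQueueTooBigException(), conf)); // override applies
        System.out.println(pauseFor(new IOException("other"), conf));       // other errors unaffected
    }
}
```

With no override configured, a CQTBE is retried with the same pause as any other error, which is the default behavior argued for above.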

> Add an option to set special retry pause when encountering CallQueueTooBigException
> -----------------------------------------------------------------------------------
>                 Key: HBASE-17114
>                 URL: https://issues.apache.org/jira/browse/HBASE-17114
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17114.patch
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of dead-waiting.
> This is good for performance in most cases, but it has a side effect: if too many clients
> connect to a busy RS, their retry requests may arrive over and over again, so the RS
> never gets a chance to recover. The issue becomes especially critical when the
> target region is META.
> So in this JIRA we propose a special retry pause for CQTBE, named
> {{hbase.client.pause.special}}, which defaults to 500ms (5 times the {{hbase.client.pause}}
> default value).

This message was sent by Atlassian JIRA
