hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
Date Fri, 18 Nov 2016 17:34:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677228#comment-15677228 ]

Gary Helmling commented on HBASE-17114:

-1 to the current patch:

* By default, retries of CQTBE should use the value from hbase.client.pause. Switching to
a different config value by default changes behavior unexpectedly for _all_ users.
For the average HBase user who has already tuned hbase.client.pause, suddenly finding
some requests pausing longer than others because of this change is a poor experience.
* hbase.client.pause.special does not describe what it actually configures. Rename it to
hbase.client.pause.callqueuetoobigexception and add it to hbase-default.xml with a
description but no default value. This needs to be clearly documented.
* Use hbase.client.pause.callqueuetoobigexception as a "special" pause for CQTBE only if
it is set; otherwise use hbase.client.pause. This allows you to configure what you
need in your environment without impacting all other HBase users.
* The added test case looks extremely sensitive to timing in the test environment
and will likely be flaky on slow or overloaded machines. I think it would be better to simply
test the calculated pause time for various configs and exceptions instead of doing an
end-to-end test of the actual sleep time.
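
To make the fallback rule and the suggested testing approach concrete, here is a minimal
sketch. It is not the patch's code: the class RetryPauseSketch, the method basePauseMs, the
nested stand-in exception, and the use of a plain Map in place of the HBase Configuration
are all assumptions made for illustration. Only the two config key names and the 100ms
default of hbase.client.pause (implied by the issue description) come from this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the proposed pause selection; not HBase's actual code. */
final class RetryPauseSketch {
    static final String PAUSE_KEY = "hbase.client.pause";
    static final String CQTBE_PAUSE_KEY = "hbase.client.pause.callqueuetoobigexception";
    // Assumed default, taken from the issue description (500ms is 5 times this value).
    static final long DEFAULT_PAUSE_MS = 100L;

    /** Stand-in for the real exception so the sketch compiles without HBase on the classpath. */
    static final class CallQueueTooBigException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    /**
     * Returns the base pause in milliseconds for a retry caused by the given error.
     * The special key applies only when it is explicitly set AND the error is a CQTBE;
     * in every other case the regular hbase.client.pause value is used.
     */
    static long basePauseMs(Map<String, String> conf, Throwable error) {
        String normal = conf.get(PAUSE_KEY);
        long normalPause = (normal != null) ? Long.parseLong(normal) : DEFAULT_PAUSE_MS;

        String special = conf.get(CQTBE_PAUSE_KEY);
        if (error instanceof CallQueueTooBigException && special != null) {
            return Long.parseLong(special);
        }
        return normalPause;
    }
}
```

A test written against a function like this only compares computed values for each
config and exception combination, so it never sleeps and does not depend on machine load.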

> Add an option to set special retry pause when encountering CallQueueTooBigException
> -----------------------------------------------------------------------------------
>                 Key: HBASE-17114
>                 URL: https://issues.apache.org/jira/browse/HBASE-17114
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17114.patch
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of dead-waiting.
> This is good for performance in most cases, but it has a side effect: if too many
> clients connect to a busy RS, their retry requests may arrive over and over again, so the RS
> never gets a chance to recover. The issue becomes especially critical when the
> target region is META.
> So in this JIRA we propose to supply a special retry pause for CQTBE named
> {{hbase.client.pause.special}}, which by default will be 500ms (5 times the {{hbase.client.pause}}
> default value).

This message was sent by Atlassian JIRA
