hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12841) ClientBackoffPolicies should support immediate rejection of submitted ops
Date Thu, 22 Jan 2015 19:01:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287986#comment-14287986 ]

Andrew Purtell commented on HBASE-12841:

Discussing this while walking around downtown SF with [~jesse_yates], he had a good simplifying
suggestion: rather than probabilistically deciding to drop some random request, which leads to
the ordering issues the application would need to deal with, as mentioned above, reject all new
ops to the overloaded server. Let me expand on this a bit: we would do so for a calculated
duration. This would be very similar to how client backoff policies work today, but instead
of submitting a DelayingRunner, we reject all ops outright, throwing RegionTooBusyExceptions
up to the application based on this local decision. 
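A minimal sketch of that per-server rejection window, in plain Java. The class and method names here (ServerRejector, rejectFor, shouldReject) are illustrative assumptions, not actual HBase client API; the real change would hang off ClientBackoffPolicy and AsyncProcess.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: once a server is judged overloaded, fail all new ops
// to it fast (where the client would throw RegionTooBusyException) until a
// calculated deadline passes, instead of queuing delayed runnables.
public class ServerRejector {
    // server name -> wall-clock time (ms) until which ops are rejected
    private final Map<String, Long> rejectUntil = new ConcurrentHashMap<>();

    // Called when server statistics indicate overload; duration would be
    // calculated by the backoff policy from those stats.
    public void rejectFor(String serverName, long durationMs) {
        rejectUntil.put(serverName, System.currentTimeMillis() + durationMs);
    }

    // Called before submitting an op; true means reject immediately
    // rather than consuming an AsyncProcess task slot.
    public boolean shouldReject(String serverName) {
        Long deadline = rejectUntil.get(serverName);
        if (deadline == null) {
            return false;
        }
        if (System.currentTimeMillis() >= deadline) {
            rejectUntil.remove(serverName); // window expired, accept again
            return false;
        }
        return true;
    }
}
```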

> ClientBackoffPolicies should support immediate rejection of submitted ops
> -------------------------------------------------------------------------
>                 Key: HBASE-12841
>                 URL: https://issues.apache.org/jira/browse/HBASE-12841
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
> The ClientBackoffPolicy interface currently has a single method:
> {code}
> public interface ClientBackoffPolicy {
>   public long getBackoffTime(ServerName serverName, byte[] region, ServerStatistics stats);
> }
> {code}
> A backoff policy can only specify the amount of delay to inject before submitting the
request(s) to a given server. 
> In the current implementation, we submit runnables to AsyncProcess that sleep
for the specified delay period before proceeding. This consumes task slots that
could otherwise be performing useful work. AsyncProcess limits the number of outstanding tasks
per region to "hbase.client.max.perregion.tasks" (default 1) and per server to "hbase.client.max.perserver.tasks"
(default 2). Tasks are accepted and queued up to "hbase.client.max.total.tasks" (default
100), after which we globally block submissions by waiting on a monitor.
> Sophisticated applications could benefit from an alternate strategy that immediately
rejects new work. Rather than returning a backoff interval, the policy could return a value
from 0.0 to 1.0, or as a percentage from 0 to 100, expressing the likelihood of task rejection.
Immediately rejected tasks consume no task slots and don't "stall" by sleeping. Overall the client
will be less likely to hit the global limit. Applications using APIs like Table#batch or Table#batchCallback
will get control back faster, can determine which operations failed due to pushback, and can deal
intelligently with request ordering and resubmission/retry concerns. In network queuing this
strategy is known as Random Early Drop (or Random Early Detection).
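The probabilistic variant described above can be sketched as follows. The class name and the load-to-probability mapping (linear ramp between a low and high water mark, the classic RED shape) are assumptions for illustration; the actual policy would derive load from ServerStatistics.

```java
import java.util.Random;

// Illustrative sketch of a policy that returns a rejection likelihood in
// [0.0, 1.0] instead of a backoff interval; the client draws a random
// number and drops the op up front rather than queuing a delayed task.
public class RejectionPolicySketch {
    private final Random random = new Random();

    // Map a load estimate in [0, 1] to a rejection probability: drop
    // nothing below the low-water mark, everything above the high-water
    // mark, with a linear ramp in between.
    public double rejectionProbability(double load) {
        double low = 0.5, high = 0.9;
        if (load <= low) {
            return 0.0;
        }
        if (load >= high) {
            return 1.0;
        }
        return (load - low) / (high - low);
    }

    // True means fail the op immediately (no task slot consumed).
    public boolean shouldReject(double load) {
        return random.nextDouble() < rejectionProbability(load);
    }
}
```

Because rejection is randomized per op rather than applied to a whole batch, individual ops within one Table#batch call can fail independently, which is exactly the ordering concern the comment above raises.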

This message was sent by Atlassian JIRA
