hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12075) Preemptive Fast Fail
Date Mon, 06 Oct 2014 06:15:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160038#comment-14160038 ]

Lars Hofhansl commented on HBASE-12075:

From the description this is an implementation of the circuit breaker pattern, right?
Anecdotally, we had that implemented here as well, but found it problematic for various reasons
and have since replaced it with a resource counter/limiter. I.e., via a simple semaphore and an
acquire/release protocol we simply limit the number of threads that use a resource (HTable,
HConnection, PhoenixConnection) to a number that is acceptable to us.
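
A minimal sketch of such a semaphore-based limiter, for illustration only. The class name ResourceLimiter, its methods, and the timeout parameter are hypothetical and are not taken from HBase, Phoenix, or the code referred to above; only java.util.concurrent.Semaphore is a real API here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: caps how many threads may use a shared resource at once. */
class ResourceLimiter {
    private final Semaphore permits;

    ResourceLimiter(int maxConcurrentUsers) {
        this.permits = new Semaphore(maxConcurrentUsers);
    }

    /**
     * Runs the given work while holding one permit. If no permit becomes
     * available within timeoutMillis, the caller is rejected instead of
     * piling up behind a slow or unavailable resource.
     */
    <T> T call(Callable<T> work, long timeoutMillis) throws Exception {
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("resource limit reached");
        }
        try {
            return work.call();
        } finally {
            // Always release, even if the work throws.
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

A caller would wrap each use of the shared resource, e.g. `limiter.call(() -> doGet(table, row), 100)`, where `doGet` stands in for whatever operation touches the HTable or connection. Unlike a circuit breaker, this does not need to decide whether a failure is recoverable; it only bounds how many threads can be stuck at once.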

CircuitBreaker was problematic for various reasons:
# we needed to be absolutely sure that this is a non-recoverable problem
# what if only a few region servers have issues (a): we now need to group exceptions by region
server in order to decide whether we need to fail other connections
# what if only a few region servers have issues (b): the cluster is not down, yet client threads
will hang
# apps created grouping constructs over HTable/HConnection (Phoenix in our case); now the
circuit breaker got in the way at the wrong times, and we need to pull it up higher
# (there were more issues; these are off the top of my head)

> Preemptive Fast Fail
> --------------------
>                 Key: HBASE-12075
>                 URL: https://issues.apache.org/jira/browse/HBASE-12075
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>    Affects Versions: 0.99.0, 2.0.0,
>            Reporter: Manukranth Kolloju
>            Assignee: Manukranth Kolloju
>         Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch,
0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch,
0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch,
> In multi-threaded clients, we use a feature developed on the 0.89-fb branch called Preemptive
Fast Fail. This allows client threads that would potentially fail to fail fast. The idea
behind this feature is that, among the hundreds of client threads, we allow one thread to
try to establish a connection with the regionserver, and if that succeeds, we mark it as a live
node again. Meanwhile, other threads that are trying to establish a connection to the same
server would otherwise run into timeouts, which is effectively unfruitful. In those
cases, we can return appropriate exceptions to those clients instead of letting them retry.
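
The idea in the description above can be sketched roughly as follows. This is not the code from the attached patches: the names FastFailGuard and FastFailException are hypothetical, servers are identified by a plain string, and any exception from the call is treated as a connection failure, which a real client would need to refine.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: one thread probes a down server, the rest fail fast. */
class FastFailGuard {
    /** Returned to callers that are rejected without contacting the server. */
    static class FastFailException extends RuntimeException {
        FastFailException(String server) {
            super("fast fail: " + server + " is marked down");
        }
    }

    // One entry per server currently marked down.
    // The flag is true while some thread is probing that server.
    private final ConcurrentHashMap<String, AtomicBoolean> downServers =
            new ConcurrentHashMap<>();

    <T> T call(String server, Callable<T> rpc) throws Exception {
        AtomicBoolean probing = downServers.get(server);
        boolean isProbe = false;
        if (probing != null) {
            // Server is marked down: only the thread that wins the CAS may try it.
            if (!probing.compareAndSet(false, true)) {
                throw new FastFailException(server);
            }
            isProbe = true;
        }
        try {
            T result = rpc.call();
            downServers.remove(server); // success: treat the server as live again
            return result;
        } catch (Exception e) {
            // Assumption: every exception counts as a connection failure.
            downServers.computeIfAbsent(server, s -> new AtomicBoolean(false));
            throw e;
        } finally {
            if (isProbe) {
                probing.set(false); // free the probe slot for the next attempt
            }
        }
    }

    boolean isMarkedDown(String server) {
        return downServers.containsKey(server);
    }
}
```

In this sketch a failed probe leaves the server marked down and lets another thread try next, while every thread that arrives during a probe gets FastFailException immediately rather than waiting for its own timeout.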

This message was sent by Atlassian JIRA
