hbase-issues mailing list archives

From "Manukranth Kolloju (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12075) Preemptive Fast Fail
Date Fri, 26 Sep 2014 21:51:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150055#comment-14150055 ]

Manukranth Kolloju commented on HBASE-12075:

I will change it to getFailureInfo().
I used HostAndPort because ServerName also includes the start code. The original implementation
used HServerAddress instead of HServerName.
ServerName would probably hurt here because the code tries to clear PFFE for servers that
have come back from death or near death. When a server comes back from death, its
ServerName is different, so the earlier ServerName would remain in the failuresMap.

But, on the bright side, having the dead server name in the failuresMap is not going to be
harmful because we have a periodic cleanup that goes and deletes the servers listed in the

So, let me change the code to use ServerName so that I don't have to convert ServerName
to HostAndPort every time we enter PFFInterceptor. I will add a couple of unit tests for the
PreemptiveFastFailInterceptor and resubmit the patch.
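
The key-choice trade-off discussed above can be illustrated with a small sketch. It is not taken from the patch: the class FailureKeyDemo, its helper methods, and the host, port, and start code values are all hypothetical, and plain strings stand in for ServerName and HostAndPort. It only shows why a key that includes the start code leaves a stale entry behind after a restart, while a host-and-port key does not.

```java
import java.util.HashMap;
import java.util.Map;

public class FailureKeyDemo {
    // Hypothetical stand-in for a ServerName-style key: host, port and start code.
    public static String fullName(String host, int port, long startCode) {
        return host + "," + port + "," + startCode;
    }

    // Hypothetical stand-in for a HostAndPort-style key: no start code.
    public static String hostAndPort(String host, int port) {
        return host + ":" + port;
    }

    // Failure recorded under the old start code, cleared under the new one.
    // The keys differ, so the old entry stays until a periodic cleanup removes it.
    public static int staleEntriesWithFullName() {
        Map<String, Long> failuresMap = new HashMap<>();
        failuresMap.put(fullName("rs1", 60020, 100L), 1L);   // before the restart
        failuresMap.remove(fullName("rs1", 60020, 200L));    // after the restart
        return failuresMap.size();
    }

    // Same scenario keyed by host and port only: the restart does not change
    // the key, so clearing the entry succeeds.
    public static int staleEntriesWithHostAndPort() {
        Map<String, Long> failuresMap = new HashMap<>();
        failuresMap.put(hostAndPort("rs1", 60020), 1L);
        failuresMap.remove(hostAndPort("rs1", 60020));
        return failuresMap.size();
    }
}
```

In this sketch the full-name variant leaves one entry behind and the host-and-port variant leaves none, which matches the concern raised in the comment.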

> Preemptive Fast Fail
> --------------------
>                 Key: HBASE-12075
>                 URL: https://issues.apache.org/jira/browse/HBASE-12075
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>    Affects Versions: 1.0.0
>            Reporter: Manukranth Kolloju
>            Assignee: Manukranth Kolloju
>             Fix For: 1.0.0
>         Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch,
> In multi-threaded clients, we use a feature developed on the 0.89-fb branch called Preemptive
Fast Fail. This allows client threads that would potentially fail to fail fast. The idea
behind this feature is that, among the hundreds of client threads, we allow one thread to
try to establish a connection with the regionserver, and if that succeeds, we mark it as a live
node again. Meanwhile, other threads trying to connect to the same server would otherwise
run into timeouts, which is unproductive. In those cases we can return appropriate
exceptions to those clients instead of letting them retry.
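
The idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class FastFailSketch, the nested FastFailException and FailureInfo types, and every method name are hypothetical. It assumes a per-server entry in a failures map and an atomic flag that lets exactly one thread probe a failed server while the others fail immediately.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class FastFailSketch {
    /** Hypothetical exception handed to threads that are not allowed to retry. */
    public static class FastFailException extends RuntimeException {
        public FastFailException(String server) {
            super("Fast failing calls to " + server);
        }
    }

    /** Hypothetical per-server state: is some thread already probing the server? */
    static class FailureInfo {
        final AtomicBoolean probeInProgress = new AtomicBoolean(false);
    }

    private final ConcurrentMap<String, FailureInfo> failuresMap = new ConcurrentHashMap<>();

    /** Record that a call to the server failed. */
    public void markFailed(String server) {
        failuresMap.putIfAbsent(server, new FailureInfo());
    }

    /** Record that a call succeeded, so the server counts as live again. */
    public void markLive(String server) {
        failuresMap.remove(server);
    }

    /** The probing thread failed too: let another thread take the next probe. */
    public void probeFailed(String server) {
        FailureInfo info = failuresMap.get(server);
        if (info != null) {
            info.probeInProgress.set(false);
        }
    }

    /**
     * Called before each attempt. Returns false if the server is not marked
     * as failed, true if the caller won the right to probe it, and throws
     * FastFailException if another thread is already probing.
     */
    public boolean beforeCall(String server) {
        FailureInfo info = failuresMap.get(server);
        if (info == null) {
            return false;
        }
        if (info.probeInProgress.compareAndSet(false, true)) {
            return true;
        }
        throw new FastFailException(server);
    }
}
```

A caller would invoke beforeCall before contacting a server, then markLive on success, markFailed on an ordinary failure, or probeFailed when it was the probing thread and the probe did not succeed.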
