hbase-issues mailing list archives

From "Julian Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9139) Independent timeout configuration for rpc channel between cluster nodes
Date Fri, 09 Aug 2013 08:03:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734538#comment-13734538 ]

Julian Zhou commented on HBASE-9139:
------------------------------------

Hi [~nkeywal], I have attached the trunk patch v0. I searched for all references to HBASE_RPC_TIMEOUT_KEY
and "hbase.rpc.timeout". Besides test/ code, only HCM and the regionserver code initialize
the rpc timeout value. HCM is still based on "hbase.rpc.timeout", so it seems we only need to apply
the new conf to the regionserver's rpc timeout. The change therefore looks straightforward and simple.
Could you help review? Thanks [~nkeywal] and [~lhofhansl].
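
A minimal sketch of the lookup the regionserver side could perform, not the code from the attached patches. It assumes the proposed key "hbase.rpc.internal.timeout" and assumes that an unset internal key falls back to "hbase.rpc.timeout"; the thread does not confirm that fallback. RpcTimeouts is a hypothetical helper, and java.util.Properties stands in for the Hadoop Configuration object so the example runs on its own.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; it is not part of HBase.
// java.util.Properties stands in for the Hadoop Configuration object.
final class RpcTimeouts {
    // Existing key and default, as stated in the issue description.
    static final String CLIENT_KEY = "hbase.rpc.timeout";
    static final int DEFAULT_TIMEOUT_MS = 60000;
    // Key name proposed in the issue description.
    static final String INTERNAL_KEY = "hbase.rpc.internal.timeout";

    private RpcTimeouts() {}

    // Timeout for client rpc calls (the HCM side keeps using this key).
    static int clientRpcTimeout(Properties conf) {
        return readInt(conf, CLIENT_KEY, DEFAULT_TIMEOUT_MS);
    }

    // Timeout for the regionserver/master rpc channel.
    // Assumption: when the internal key is unset, reuse the client timeout
    // so that existing deployments behave as before.
    static int internalRpcTimeout(Properties conf) {
        return readInt(conf, INTERNAL_KEY, clientRpcTimeout(conf));
    }

    // Returns the integer value stored under key, or fallback if the key is absent.
    private static int readInt(Properties conf, String key, int fallback) {
        String raw = conf.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }
}
```

With a shared hbase-site.xml that raises "hbase.rpc.timeout" to 600000 ms, setting only the internal key to a smaller value would shorten the regionserver's wait on a dead master while client calls keep the 10 min timeout.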
                
> Independent timeout configuration for rpc channel between cluster nodes
> -----------------------------------------------------------------------
>
>                 Key: HBASE-9139
>                 URL: https://issues.apache.org/jira/browse/HBASE-9139
>             Project: HBase
>          Issue Type: Improvement
>          Components: IPC/RPC, regionserver
>    Affects Versions: 0.94.10, 0.96.0
>            Reporter: Julian Zhou
>            Assignee: Julian Zhou
>            Priority: Minor
>             Fix For: 0.94.12, 0.96.0
>
>         Attachments: 9139-0.94-v0.patch, 9139-trunk-v0.patch
>
>
> The default of "hbase.rpc.timeout" is 60000 ms (1 min). Users sometimes
> increase it to a larger value, such as 600000 ms (10 mins), for client
> applications that run many concurrent loads. Some users share the same
> hbase-site.xml between client and server.
> HRegionServer#tryRegionServerReport reports to the live master over the
> rpc channel, but there is a window in the master failover scenario: a
> region server tries to connect to a master that was just killed; the
> backup master takes the active role immediately and writes itself to
> /hbase/master, but the region server is still waiting for the rpc
> timeout on its connection to the dead master. If "hbase.rpc.timeout" is
> too long, the master failover process takes a long time because of that
> rpc timeout against the dead master.
> If so, could we split this into 2 options: "hbase.rpc.timeout" stays
> for the hbase client, while "hbase.rpc.internal.timeout" is for the
> regionserver/master rpc channel, which could be set to a shorter value
> without affecting the real client rpc timeout value?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
