hadoop-hdfs-issues mailing list archives

From "Konstantin Boudnik (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-4646) createNNProxyWithClientProtocol ignores configured timeout value
Date Fri, 05 Apr 2013 20:35:17 GMT

     [ https://issues.apache.org/jira/browse/HDFS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Konstantin Boudnik updated HDFS-4646:

    Fix Version/s: 2.0.5-beta
> createNNProxyWithClientProtocol ignores configured timeout value
> ----------------------------------------------------------------
>                 Key: HDFS-4646
>                 URL: https://issues.apache.org/jira/browse/HDFS-4646
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0, 2.0.3-alpha, 2.0.4-alpha
>         Environment: Linux
>            Reporter: Jagane Sundar
>            Priority: Minor
>             Fix For: 3.0.0, 2.0.5-beta, 2.0.4-alpha
>         Attachments: HDFS-4646.001.patch, HDFS-4646.patch
> The Client RPC I/O timeout mechanism appears to be configured by two core-site.xml parameters:
> 1. A boolean ipc.client.ping
> 2. A numeric value ipc.ping.interval
> If ipc.client.ping is true, then we send an RPC ping every ipc.ping.interval milliseconds.
> If ipc.client.ping is false, then ipc.ping.interval turns into the socket timeout value.
> The bug here is that while creating a non-HA proxy, the configured timeout value is ignored, and 0 is passed in. 0 is taken to mean 'wait forever', and the client RPC socket never times out.
> Note that this bug is reproducible only in the case where the NN machine dies, i.e. the TCP stack with the NN IP address stops responding completely. The code does not take this path when you do a 'kill -9' of the NN process, since there is a TCP stack that is alive and sends out a TCP RST to the client, and that results in a socket error (not a timeout).
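The difference between a peer that goes silent and one that actively resets the connection can be illustrated with plain java.net sockets. The following is a minimal sketch, not Hadoop code: the class SilentPeerSketch and the method readTimesOut are hypothetical names, and a local listener that never replies only approximates a NameNode host whose TCP stack has stopped responding. With a positive SO_TIMEOUT the blocked read gives up with a SocketTimeoutException; with a value of 0, java.net.Socket would block indefinitely, which is why the sketch never calls it with 0.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SilentPeerSketch {

    /**
     * Connects to a local listener that never sends any data and reports
     * whether the blocked read ended with a timeout.
     * Hypothetical helper for illustration only; soTimeoutMs must be > 0,
     * because 0 means "block forever" for java.net.Socket reads.
     */
    static boolean readTimesOut(int soTimeoutMs) throws IOException {
        // The listener never calls accept() and never writes: the kernel
        // completes the TCP handshake, but no payload ever arrives.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(soTimeoutMs);
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived, so no timeout
            } catch (SocketTimeoutException e) {
                return true;  // the read waited soTimeoutMs and gave up
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

A killed process on a live host behaves differently: the surviving TCP stack answers with RST, so the client sees a socket error straight away regardless of the timeout setting.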
> The fix is to pass in the correct configured value for timeout by calling Client.getTimeout(conf) instead of passing in 0.
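The way the two parameters combine into a single timeout value can be sketched as follows. This is a self-contained illustration under stated assumptions, not the Hadoop source: RpcTimeoutSketch and resolveTimeout are hypothetical names, a plain Map stands in for the Hadoop Configuration object, the defaults (ping enabled, 60000 ms interval) are assumed, and -1 is used here as an assumed marker for "no explicit socket timeout, rely on pings".

```java
import java.util.HashMap;
import java.util.Map;

public class RpcTimeoutSketch {
    static final String PING_KEY = "ipc.client.ping";
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";

    // Assumed defaults for this sketch.
    static final String DEFAULT_PING = "true";
    static final String DEFAULT_PING_INTERVAL_MS = "60000";

    /**
     * Hypothetical stand-in for a getTimeout(conf)-style lookup.
     * If pinging is disabled, the ping interval doubles as the socket
     * timeout; otherwise -1 signals that no explicit timeout applies.
     */
    static int resolveTimeout(Map<String, String> conf) {
        boolean pingEnabled =
            Boolean.parseBoolean(conf.getOrDefault(PING_KEY, DEFAULT_PING));
        int intervalMs =
            Integer.parseInt(conf.getOrDefault(PING_INTERVAL_KEY, DEFAULT_PING_INTERVAL_MS));
        return pingEnabled ? -1 : intervalMs;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PING_KEY, "false");
        conf.put(PING_INTERVAL_KEY, "5000");
        // The buggy path described above would hand 0 ("wait forever") to
        // the proxy; the intended path hands over the configured value.
        System.out.println("configured timeout (ms): " + resolveTimeout(conf));
    }
}
```

The point of the sketch is only that the value handed to the proxy should come from the configuration rather than being a hard-coded 0.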

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
