hadoop-common-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-7397) Allow configurable timeouts when connecting to HDFS via java FileSystem API
Date Sat, 22 Sep 2012 18:39:08 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Harsh J updated HADOOP-7397:

    Attachment: HADOOP-7397.patch


bq. define a key for the maximum number of connection retries too, rather than the hard-coded
value of 45 which is there now (I think that may be a new feature of 0.23+)

This is already present today via ipc.client.connect.max.retries and ipc.client.connect.max.retries.on.timeouts.

[~scottfines] - Patch looks good enough to me. I rebased it onto trunk and adjusted it to match
the style we now use for configs.

> Allow configurable timeouts when connecting to HDFS via java FileSystem API
> ---------------------------------------------------------------------------
>                 Key: HADOOP-7397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7397
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.23.0
>         Environment: Any
>            Reporter: Scott Fines
>            Priority: Minor
>              Labels: hadoop
>         Attachments: HADOOP-7397.patch, timeout.patch
> If the NameNode is not available (for example, during a network partition separating
the client from the NameNode) and an attempt is made to connect, then the FileSystem API
will *eventually* time out and throw an error. However, that timeout is currently hardcoded
to 20 seconds per connection attempt, with 45 retries, for a total wait of 15 minutes before failure.
While in many circumstances this is fine, there are also many circumstances (such as booting
a service) where both the connection timeout and the number of retries should be significantly
lower, so as not to harm the availability of other services.
> Investigating Client.java, I see that there are two fields in Connection: maxRetries
and rpcTimeout. I propose either re-using those fields when initiating the connection or,
alternatively, using the existing dfs.socket.timeout parameter to set the connection
timeout on initialization, and potentially adding a new field such as dfs.connection.retries
with a default of 45 to replicate the current behavior.
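
As a rough check of the arithmetic in the description above, the worst-case wait can be sketched as the per-attempt connect timeout multiplied by the number of attempts. This is only an illustration under stated assumptions: the class name `ConnectWait` and the method `worstCase` are hypothetical and not part of Hadoop, the reduced values of 5 seconds and 2 attempts are made up, and any sleep between retries is ignored.

```java
import java.time.Duration;

// Hypothetical helper, not Hadoop code: estimates how long a client may block
// before a connection failure is reported.
public class ConnectWait {

    // Worst case = per-attempt connect timeout x number of attempts.
    // Assumption: no extra sleep between attempts is counted.
    static Duration worstCase(Duration connectTimeout, int attempts) {
        if (attempts < 0) {
            throw new IllegalArgumentException("attempts must be non-negative");
        }
        return connectTimeout.multipliedBy(attempts);
    }

    public static void main(String[] args) {
        // Values quoted in the issue description: 20 seconds, 45 retries.
        Duration current = worstCase(Duration.ofSeconds(20), 45);
        System.out.println("current: " + current.toMinutes() + " min");

        // Made-up lower values, e.g. for a service that must boot quickly.
        Duration reduced = worstCase(Duration.ofSeconds(5), 2);
        System.out.println("reduced: " + reduced.getSeconds() + " s");
    }
}
```

With the values from the description this gives the 15 minutes quoted there; lowering either factor shrinks the wait proportionally, which is why both the timeout and the retry count are worth making configurable.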
