hadoop-common-dev mailing list archives

From "Wilfred Spiegelenburg (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-11252) RPC client write does not time out by default
Date Fri, 31 Oct 2014 01:57:34 GMT
Wilfred Spiegelenburg created HADOOP-11252:
----------------------------------------------

             Summary: RPC client write does not time out by default
                 Key: HADOOP-11252
                 URL: https://issues.apache.org/jira/browse/HADOOP-11252
             Project: Hadoop Common
          Issue Type: Bug
          Components: ipc
    Affects Versions: 2.5.0
            Reporter: Wilfred Spiegelenburg


The RPC client uses a default timeout of 0 when no timeout is passed in. This means that
the network connection it creates will not time out when used to write data. The issue has shown
up in YARN-2578 and HDFS-4858. Write timeouts then fall back to the TCP-level retry mechanism
(configured via tcp_retries2), which takes between 15 and 30 minutes to give up. That is too
long for a default behaviour.
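
As a rough illustration of where the lower end of that range comes from, the sketch below adds up
the retransmission waits under exponential backoff. The values used (tcp_retries2 = 15, a minimum
retransmission timeout of 200 ms and a cap of 120 s) are assumed Linux defaults and are not taken
from this report; the real timeout also depends on the measured round-trip time, so treat the
result as an estimate only. The class and method names are hypothetical.

```java
import java.util.Locale;

public class TcpRetries2Estimate {

    // Sums the waits for the initial transmission plus `retries` retransmissions.
    // The wait starts at rtoMinMs and doubles after every attempt, capped at rtoMaxMs.
    static long estimateTimeoutMillis(int retries, long rtoMinMs, long rtoMaxMs) {
        long total = 0;
        long rto = rtoMinMs;
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, rtoMaxMs);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed defaults: tcp_retries2 = 15, RTO min = 200 ms, RTO max = 120 s.
        long ms = estimateTimeoutMillis(15, 200, 120_000);
        System.out.println(String.format(Locale.ROOT,
                "%.1f s (about %.1f min)", ms / 1000.0, ms / 60000.0));
    }
}
```

With those assumed values the estimate is a little over 15 minutes; a larger measured round-trip
time pushes it higher.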

Using 0 as the default value for timeout is incorrect. We should use a sane value for the
timeout and the "ipc.ping.interval" configuration value is a logical choice for it. The default
behaviour should be changed from 0 to the value read for the ping interval from the Configuration.
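
A minimal sketch of the proposed behaviour, not the actual patch: a plain Map stands in for the
Hadoop Configuration so the example runs on its own, and resolveRpcTimeout is a hypothetical
helper name. Only the "ipc.ping.interval" key comes from this report; the 60000 ms fallback is
an assumed default for that key.

```java
import java.util.HashMap;
import java.util.Map;

public class RpcTimeoutDefault {

    static final String PING_INTERVAL_KEY = "ipc.ping.interval";
    // Assumed default for ipc.ping.interval, in milliseconds.
    static final int PING_INTERVAL_DEFAULT_MS = 60_000;

    // Returns the caller's timeout if one was passed in; otherwise falls back to
    // the configured ping interval instead of 0 (which means "never time out").
    static int resolveRpcTimeout(int requestedTimeoutMs, Map<String, String> conf) {
        if (requestedTimeoutMs > 0) {
            return requestedTimeoutMs;
        }
        String configured = conf.get(PING_INTERVAL_KEY);
        return configured == null
                ? PING_INTERVAL_DEFAULT_MS
                : Integer.parseInt(configured.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PING_INTERVAL_KEY, "30000");

        System.out.println(resolveRpcTimeout(0, conf));              // no timeout passed in
        System.out.println(resolveRpcTimeout(5000, conf));           // explicit timeout wins
        System.out.println(resolveRpcTimeout(0, new HashMap<>()));   // key not configured
    }
}
```

Callers that already pass an explicit timeout are unaffected; only the "no timeout passed in"
path changes, which is why a single fix in common would cover every call site.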

Fixing it in common makes more sense than finding and changing all other points in the code
that do not pass in a timeout.

Offending code lines:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L488
and 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java#L350



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
