hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-3357) DataXceiver reads from client socket with incorrect/no timeout
Date Thu, 03 May 2012 21:18:48 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3357:

    Attachment: hdfs-3357.txt

Attached patch adds the comment suggested by Eli above. Waiting on HADOOP-8350 to submit to
> DataXceiver reads from client socket with incorrect/no timeout
> --------------------------------------------------------------
>                 Key: HDFS-3357
>                 URL: https://issues.apache.org/jira/browse/HDFS-3357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 1.0.2, 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-3357.txt, hdfs-3357.txt
> In DataXceiver, we currently use Socket.setSoTimeout to try to manage the read timeout
> when switching between reading the initial opcode, reading a keepalive opcode, and reading
> the status after a successfully sent block. However, since all of these reads use the same
> underlying DataInputStream, the change to the socket timeout isn't respected. Thus, they all
> occur with whatever timeout is set on the socket at the time of DataXceiver construction.
> In practice this turns out to be 0, which can cause xceivers to hang indefinitely.
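
As a minimal illustration of why a timeout of 0 is dangerous, the following Java sketch uses only plain java.net sockets, not Hadoop's DataXceiver or its stream wrappers. The class name ReadTimeoutSketch and the method readTimesOut are hypothetical and do not come from the attached patch. The sketch shows that a positive SO_TIMEOUT makes a blocked read on an idle connection fail with SocketTimeoutException; with a value of 0, the same read would block until the peer sends data or closes the connection.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; plain JDK sockets only, no Hadoop code involved.
class ReadTimeoutSketch {

    /**
     * Accepts one loopback connection from a client that never writes,
     * then tries to read a single byte with the given SO_TIMEOUT.
     * Returns true if the read gave up with SocketTimeoutException.
     * Pass a value greater than 0: a timeout of 0 means "wait forever",
     * so this method would not return for an idle client.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket idleClient =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // Bound every subsequent blocking read on this socket.
            accepted.setSoTimeout(timeoutMillis);

            DataInputStream in = new DataInputStream(accepted.getInputStream());
            try {
                in.readByte(); // the client sends nothing, so this can only time out
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

The sketch only covers the consequence named in the description (a read with no effective timeout never gives up). It does not reproduce the DataXceiver-specific reason why later setSoTimeout calls are not respected.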

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

