hadoop-hdfs-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-915) Hung DN stalls write pipeline for far longer than its timeout
Date Sat, 23 Jan 2010 00:02:21 GMT
Hung DN stalls write pipeline for far longer than its timeout
-------------------------------------------------------------

                 Key: HDFS-915
                 URL: https://issues.apache.org/jira/browse/HDFS-915
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.20.1
            Reporter: Todd Lipcon


After running kill -STOP on a datanode in the middle of a write pipeline, the client takes
far longer to recover than it should. The ResponseProcessor times out at the correct interval
but doesn't interrupt the DataStreamer, which does not appear to be subject to the same timeout.
The client recovers only once the OS actually declares the TCP stream dead, which can take
a very long time.
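
To illustrate the failure mode, here is a minimal Java sketch. It is not the DFSClient
code: the class name, method name, and the two queues (stand-ins for the socket send
buffer and the ack stream) are all hypothetical. It only shows that a timeout on the
thread waiting for acks does nothing for a second thread blocked in an untimed write,
unless the first thread explicitly interrupts it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class StalledPipelineSketch {

    /**
     * Simulates one write to a peer that never drains data and never sends acks.
     * Returns true if the streamer thread has exited within one second of the
     * responder's ack timeout firing.
     */
    public static boolean streamerStopsAfterAckTimeout(boolean interruptStreamer,
                                                       long ackTimeoutMs)
            throws InterruptedException {
        // Stand-in for the socket send buffer: one slot, never drained (hung peer).
        BlockingQueue<byte[]> sendBuffer = new ArrayBlockingQueue<>(1);
        // Stand-in for the ack stream: the hung peer never puts anything here.
        BlockingQueue<Long> acks = new ArrayBlockingQueue<>(1);

        Thread streamer = new Thread(() -> {
            try {
                while (true) {
                    // Untimed write: blocks forever once the buffer is full.
                    sendBuffer.put(new byte[512]);
                }
            } catch (InterruptedException e) {
                // Interrupted by the responder: stop so recovery could begin.
            }
        }, "streamer");
        streamer.setDaemon(true);

        Thread responder = new Thread(() -> {
            try {
                // Timed wait for an ack; returns null when the timeout expires.
                Long ack = acks.poll(ackTimeoutMs, TimeUnit.MILLISECONDS);
                if (ack == null && interruptStreamer) {
                    streamer.interrupt();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "responder");
        responder.setDaemon(true);

        streamer.start();
        responder.start();
        responder.join();     // returns once the ack timeout has fired
        streamer.join(1000);  // grace period for the streamer to notice
        return !streamer.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("no interrupt: streamer stopped = "
                + streamerStopsAfterAckTimeout(false, 200));
        System.out.println("interrupt:    streamer stopped = "
                + streamerStopsAfterAckTimeout(true, 200));
    }
}
```

In this sketch the first call reports false (the streamer is still blocked after the
responder has timed out) and the second reports true. Whether an interrupt alone would
unblock a real socket write depends on the stream implementation in use, so treat this
as a model of the thread interaction only.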

I've experienced this on 0.20.1; I haven't tried it yet on trunk or 0.21.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

