hadoop-hdfs-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-916) Rewrite DFSOutputStream to use a single thread with NIO
Date Sat, 23 Jan 2010 00:35:21 GMT
Rewrite DFSOutputStream to use a single thread with NIO
-------------------------------------------------------

                 Key: HDFS-916
                 URL: https://issues.apache.org/jira/browse/HDFS-916
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs client
    Affects Versions: 0.22.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon


The DFS write pipeline code has some really hairy multi-threaded synchronization. It has
produced a lot of bugs (HDFS-101, HDFS-793, HDFS-915, and dozens of others), because the
message passing, lock sharing, and interruption properties are very hard to understand. The
multiple threads exist so that the client can send and receive simultaneously. If the code
used nonblocking IO on a single thread instead, I think the whole thing would be a lot less
error-prone.
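
To make the idea concrete, here is a minimal sketch of sending and receiving on a single
thread with a java.nio Selector. It is not the DFSOutputStream code and does not use the
real HDFS wire format: the class NioWriterSketch, the method sendAndCollectAcks, the 4-byte
"packets" and "acks", and the loopback echo peer standing in for a datanode are all
assumptions made up for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HDFS.
class NioWriterSketch {

    // Sends packetCount 4-byte sequence numbers and collects the 4-byte acks that
    // come back, with every send and receive performed on the calling thread.
    static List<Integer> sendAndCollectAcks(int packetCount) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel listener = ServerSocketChannel.open()) {
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(listener.getLocalAddress());
                 SocketChannel peer = listener.accept()) {
                client.configureBlocking(false);
                peer.configureBlocking(false);

                ByteBuffer out = ByteBuffer.allocate(4 * packetCount); // packets to send
                for (int seq = 0; seq < packetCount; seq++) {
                    out.putInt(seq);
                }
                out.flip();
                ByteBuffer in = ByteBuffer.allocate(4 * packetCount);  // acks received
                ByteBuffer echo = ByteBuffer.allocate(1024);           // stand-in peer's buffer

                SelectionKey clientKey = client.register(
                        selector, SelectionKey.OP_READ | SelectionKey.OP_WRITE);
                peer.register(selector, SelectionKey.OP_READ);

                while (in.hasRemaining()) {
                    selector.select(); // one wait point instead of one thread per direction
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key == clientKey) {
                            if (key.isWritable()) {
                                client.write(out); // may be a partial write
                                if (!out.hasRemaining()) {
                                    key.interestOps(SelectionKey.OP_READ); // done sending
                                }
                            }
                            if (key.isReadable() && client.read(in) < 0) {
                                throw new EOFException("peer closed before all acks arrived");
                            }
                        } else {
                            // Stand-in peer: echo every byte back as its "ack".
                            if (key.isReadable()) {
                                peer.read(echo);
                            }
                            echo.flip();
                            peer.write(echo);
                            echo.compact();
                            key.interestOps(echo.position() > 0
                                    ? SelectionKey.OP_READ | SelectionKey.OP_WRITE
                                    : SelectionKey.OP_READ);
                        }
                    }
                }

                in.flip();
                List<Integer> acks = new ArrayList<>();
                while (in.hasRemaining()) {
                    acks.add(in.getInt());
                }
                return acks;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndCollectAcks(3));
    }
}
```

The point of the sketch is that all state (the outgoing buffer, the ack buffer, the interest
set) is touched by one thread only, so it needs no locks, wait/notify, or interrupts. A real
client would also have to handle timeouts, pipeline recovery, and the actual packet format.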

I think we could do this in two halves: one half is DFSOutputStream, the other is
BlockReceiver. I opened this JIRA first because I think DFSOutputStream is the simpler half
(only one TCP connection to deal with, rather than both an upstream and a downstream one).

Opinions? Am I crazy? I would like to see some agreement on the idea before I spend time writing
code.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

