hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-916) Rewrite DFSOutputStream to use a single thread with NIO
Date Mon, 01 Feb 2010 21:56:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828287#action_12828287 ]

Todd Lipcon commented on HDFS-916:

Patch is up on HDFS-914 - hopefully we can move quickly on that, since it'll have to be redone
if the code moves under it.

> Rewrite DFSOutputStream to use a single thread with NIO
> -------------------------------------------------------
>                 Key: HDFS-916
>                 URL: https://issues.apache.org/jira/browse/HDFS-916
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
> The DFS write pipeline code has some really hairy multi-threaded synchronization. There
> have been a lot of bugs produced by this (HDFS-101, HDFS-793, HDFS-915, tens of others)
> since it's very hard to understand the message passing, lock sharing, and interruption
> properties. The reason for the multiple threads is to be able to simultaneously send and
> receive. If instead of using multiple threads, it used nonblocking IO, I think the whole
> thing would be a lot less error prone.
> I think we could do this in two halves: one half is the DFSOutputStream. The other half
> is BlockReceiver. I opened this JIRA first as I think it's simpler (only one TCP connection
> to deal with, rather than an up and downstream).
> Opinions? Am I crazy? I would like to see some agreement on the idea before I spend time
> writing code.
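
To make the idea in the description concrete, here is a minimal sketch of one thread both
sending and receiving on a single TCP connection with a java.nio Selector. It is not the
HDFS-916 patch and does not use any DFSOutputStream code: the class name
SingleThreadDuplexSketch, the methods sendAndCollect and loopbackEcho, and the echo peer
standing in for a datanode are all hypothetical, chosen only for illustration. The framing
(raw bytes out, an equal number of bytes back) is likewise an assumption, not the DFS
packet/ack protocol.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SingleThreadDuplexSketch {

    /**
     * Writes all of outgoing and reads expectedReplyBytes back on the calling
     * thread only, interleaving both directions as the socket becomes ready.
     */
    public static byte[] sendAndCollect(SocketChannel ch, byte[] outgoing, int expectedReplyBytes)
            throws IOException {
        ch.configureBlocking(false);
        ByteBuffer out = ByteBuffer.wrap(outgoing);
        ByteBuffer in = ByteBuffer.allocate(expectedReplyBytes);
        try (Selector selector = Selector.open()) {
            SelectionKey key = ch.register(selector, 0);
            while (out.hasRemaining() || in.hasRemaining()) {
                // Only ask for the events we can still act on, so select() does not spin.
                int ops = (out.hasRemaining() ? SelectionKey.OP_WRITE : 0)
                        | (in.hasRemaining() ? SelectionKey.OP_READ : 0);
                key.interestOps(ops);
                selector.select();
                selector.selectedKeys().clear();
                if (key.isWritable() && out.hasRemaining()) {
                    ch.write(out); // may write only part of the buffer; we loop
                }
                if (key.isReadable() && in.hasRemaining()) {
                    if (ch.read(in) < 0) {
                        throw new EOFException("peer closed before all replies arrived");
                    }
                }
            }
        }
        return in.array();
    }

    /** Demo helper: runs sendAndCollect against a loopback echo peer (hypothetical stand-in). */
    public static byte[] loopbackEcho(byte[] payload) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            // The peer gets its own thread only because it plays the remote side.
            Thread peer = new Thread(() -> {
                try (SocketChannel s = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(4096);
                    while (s.read(buf) >= 0) {
                        buf.flip();
                        while (buf.hasRemaining()) {
                            s.write(buf);
                        }
                        buf.clear();
                    }
                } catch (IOException ignored) {
                    // Peer shutdown during the demo is not interesting here.
                }
            });
            peer.setDaemon(true);
            peer.start();
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                return sendAndCollect(client, payload, payload.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1 << 20];
        byte[] echoed = loopbackEcho(payload);
        System.out.println("echoed bytes: " + echoed.length);
    }
}
```

With a payload larger than the socket buffers, a client that did a blocking write of
everything before reading anything could stall against this echo peer, since each side
would be waiting for the other to drain. The selector loop avoids that without a second
thread, which is the property the description is after.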

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
