hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-918) Use single Selector and small thread pool to replace many instances of BlockSender for reads
Date Thu, 11 Mar 2010 23:54:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844280#action_12844280
] 

Todd Lipcon commented on HDFS-918:
----------------------------------

bq. I think it is better to use something like Netty rather than worry about all the selector
handling as in this patch.

I started some work on this last month and got a prototype working. The issue is that switching
to Netty is rather wholesale - all of the existing code has to change away from using InputStream/OutputStream
to use the Netty Channels. I managed to write some interfacing code to provide IS/OS interfaces
around the Netty channels and got unit tests to pass, but there's a lot of inefficiency introduced
by my wrappers.

I'll continue to work on that as I have time, but I think this smaller patch is worth considering
in the meantime.

> Use single Selector and small thread pool to replace many instances of BlockSender for
reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, hdfs-918-20100211.patch,
hdfs-918-20100228.patch, hdfs-918-20100309.patch, hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXCeiver server allocates a new thread per request,
which must allocate its own buffers and leads to higher-than-optimal CPU and memory usage
by the sending threads.  If we had a single selector and a small threadpool to multiplex request
packets, we could theoretically achieve higher performance while taking up fewer resources
and leaving more CPU on datanodes available for mapred, hbase or whatever.  This can be done
without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message