hadoop-hdfs-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-918) Use single Selector and small thread pool to replace many instances of BlockSender for reads
Date Wed, 31 Mar 2010 02:49:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851709#action_12851709 ]

Andrew Purtell commented on HDFS-918:
-------------------------------------

I applied hdfs-918-branch20.2.patch to vanilla 0.20.2 and built it with 'ant tar'. I then
substituted the resulting Hadoop core and test jars for those bundled with HBase 0.20.3,
rebuilt HBase with 'ant tar', and built new AMIs using a well-tested process that normally
produces working HBase+Hadoop systems. HDFS appears to initialize fine (I see registration
messages in the NN and DN logs), but the DFSClient in the HBase master cannot bootstrap:
{quote}
2010-03-30 22:31:17,690 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_3225003771095476151_1021 from any node:  java.io.IOException: No live nodes contain current block
2010-03-30 22:33:20,698 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_3225003771095476151_1021 from any node:  java.io.IOException: No live nodes contain current block
...
{quote}

The EC2 setup normally logs at INFO because it is used for benchmarking, but I can switch to
DEBUG and provide logs if that would be useful.

> Use single Selector and small thread pool to replace many instances of BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, hdfs-918-20100211.patch,
>                      hdfs-918-20100228.patch, hdfs-918-20100309.patch, hdfs-918-branch20.2.patch,
>                      hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXceiver server allocates a new thread per request.
> Each thread must allocate its own buffers, which leads to higher-than-optimal CPU and memory
> usage by the sending threads.  If we had a single selector and a small thread pool to multiplex
> request packets, we could theoretically achieve higher performance while taking up fewer
> resources and leaving more CPU on datanodes available for mapred, hbase or whatever.  This can
> be done without changing any wire protocols.
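
The multiplexing idea in the description above can be illustrated with a minimal Java NIO
sketch. This is not the code from the attached patches. It assumes an in-process
java.nio.channels.Pipe as a stand-in for each datanode-to-client socket, and the class name,
method names, and PACKET_SIZE value are hypothetical. One thread runs the Selector; when a
channel becomes writable, its key is disarmed and a pool worker writes one packet, then
re-arms the key.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; names and sizes are assumptions, not HDFS code.
class MultiplexedSenderSketch {
    static final int PACKET_SIZE = 4096; // arbitrary packet size for the sketch

    /** Sends each payload over its own pipe; returns the byte count each reader saw. */
    static long[] sendAll(byte[][] payloads, int poolSize) throws Exception {
        long[] received = new long[payloads.length];
        Thread[] readers = new Thread[payloads.length];
        AtomicInteger open = new AtomicInteger(payloads.length);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Selector selector = Selector.open();
        try {
            for (int i = 0; i < payloads.length; i++) {
                Pipe pipe = Pipe.open();
                pipe.sink().configureBlocking(false);
                // The pending data travels with the key as its attachment.
                pipe.sink().register(selector, SelectionKey.OP_WRITE, ByteBuffer.wrap(payloads[i]));
                final int id = i;
                final Pipe.SourceChannel source = pipe.source();
                // Reader threads simulate clients; they are not part of the sender.
                readers[i] = new Thread(() -> received[id] = drain(source));
                readers[i].setDaemon(true);
                readers[i].start();
            }
            while (open.get() > 0) {
                selector.select(100);
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isValid() && key.isWritable()) {
                        key.interestOps(0); // disarm so only one worker owns this key
                        pool.execute(() -> writePacket(key, open));
                    }
                }
                selector.selectedKeys().clear();
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            selector.close();
        }
        for (Thread reader : readers) reader.join();
        return received;
    }

    /** Writes at most one packet, then re-arms the key or closes the channel. */
    static void writePacket(SelectionKey key, AtomicInteger open) {
        ByteBuffer data = (ByteBuffer) key.attachment();
        WritableByteChannel channel = (WritableByteChannel) key.channel();
        try {
            ByteBuffer packet = data.slice();
            packet.limit(Math.min(PACKET_SIZE, packet.remaining()));
            int written = channel.write(packet); // non-blocking, may be partial
            data.position(data.position() + written);
            if (data.hasRemaining()) {
                key.interestOps(SelectionKey.OP_WRITE); // last touch of the key
            } else {
                channel.close();
                open.decrementAndGet();
            }
        } catch (IOException e) {
            try { channel.close(); } catch (IOException ignored) { }
            open.decrementAndGet();
        }
        key.selector().wakeup(); // let the selector thread see the change promptly
    }

    /** Blocking read until end of stream; returns total bytes read, or -1 on error. */
    static long drain(ReadableByteChannel source) {
        ByteBuffer buf = ByteBuffer.allocate(PACKET_SIZE);
        long total = 0;
        try {
            int n;
            while ((n = source.read(buf)) >= 0) {
                total += n;
                buf.clear();
            }
            source.close();
        } catch (IOException e) {
            return -1;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        long[] got = sendAll(new byte[][] { new byte[10_000], new byte[300_000] }, 2);
        for (long count : got) System.out.println(count + " bytes delivered");
    }
}
```

In this sketch the number of sending threads is fixed by poolSize regardless of how many
readers are connected, which is the resource saving the description argues for. Whether it
actually performs better than thread-per-request would have to be measured.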

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.