hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-918) Use single Selector and small thread pool to replace many instances of BlockSender for reads
Date Thu, 04 Feb 2010 01:26:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829368#action_12829368
] 

ryan rawson commented on HDFS-918:
----------------------------------

the problem was we were using a stateful interface previously because it was faster in scan
tests, so we serialized reads within 1 RS to any given HFile.  With multiple client handler
threads asking for different parts of a large file, we get a serialized behaviour which hurts
random get performance.

So we are moving back to pread, which means we will get more parallelism - depending your
table read pattern of course.  But I want to get even more parallelism, by preading multiple
hfiles during a scan/get for example.  This will just up the thread pressure on the datanode.

> Use single Selector and small thread pool to replace many instances of BlockSender for
reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hdfs-918-20100201.patch, hdfs-918-20100203.patch, hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXCeiver server allocates a new thread per request,
which must allocate its own buffers and leads to higher-than-optimal CPU and memory usage
by the sending threads.  If we had a single selector and a small threadpool to multiplex request
packets, we could theoretically achieve higher performance while taking up fewer resources
and leaving more CPU on datanodes available for mapred, hbase or whatever.  This can be done
without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message