hadoop-hdfs-issues mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5270) Use thread pools in the datenode daemons
Date Fri, 27 Sep 2013 23:52:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780582#comment-13780582 ]

Haohui Mai commented on HDFS-5270:
----------------------------------

Some preliminary results from running the attached test on a 3-node cluster (elapsed times in milliseconds):

_trunk_:

{code}
$ bin/hadoop jar ~/test-concurrent.jar me.haohui.test.TestConcurrentAccess w 320 50000
...
13/09/27 22:14:31 INFO hdfs.DFSClient: Could not complete /user/hortonhm/concurrent/4508 retrying...
...
elapsed time: 307750
{code}

_With the patch_:

{code}
$ bin/hadoop jar ~/test-concurrent.jar me.haohui.test.TestConcurrentAccess w 320 50000
...
elapsed time: 16414
{code}

These numbers represent the end-to-end performance of HDFS from a client's perspective.

One thing worth noting: if the datanode is fast enough in replying to the namenode,
the client can avoid retrying the complete calls (the "Could not complete ... retrying"
messages in the trunk run), resulting in a roughly 19x improvement in end-to-end
performance (307750 ms vs. 16414 ms).

Though anecdotal, the result suggests that a heavily loaded cluster might benefit
from this patch.
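
The ratio between the two elapsed times reported above can be checked directly. This is only an arithmetic sketch; the class and method names are made up for illustration and are not part of the attached test:

```java
// Hypothetical helper, not part of TestConcurrentAccess: computes the
// speedup implied by two elapsed-time measurements in milliseconds.
class SpeedupCheck {
    static double speedup(long baselineMillis, long patchedMillis) {
        // Cast before dividing so the result is not truncated to a long.
        return (double) baselineMillis / patchedMillis;
    }

    public static void main(String[] args) {
        // Elapsed times copied from the trunk and patched runs above.
        System.out.printf("speedup: %.2fx%n", speedup(307750L, 16414L));
    }
}
```

Both runs are single measurements, so the ratio should be read as indicative rather than as a benchmark result.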
                
> Use thread pools in the datenode daemons
> ----------------------------------------
>
>                 Key: HDFS-5270
>                 URL: https://issues.apache.org/jira/browse/HDFS-5270
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-5270.000.patch, TestConcurrentAccess.java
>
>
> The current implementation of the datanode creates a thread when a new request comes
> in. This incurs high overheads for the creation / destruction of threads, making the datanode
> unstable under high concurrent loads.
> This JIRA proposes to use a thread pool to reduce the overheads.
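
The difference between the two approaches described above can be sketched with the standard `java.util.concurrent` API. This is a generic illustration, not the code in HDFS-5270.000.patch: the class name, the method names, and the trivial counter that stands in for real request handling are all assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from the datanode.
class RequestHandlingSketch {

    // Thread-per-request: one thread is created and torn down for every request.
    static int handleWithNewThreads(int requests) {
        AtomicInteger handled = new AtomicInteger();
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(handled::incrementAndGet);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return handled.get();
    }

    // Pooled: a fixed number of worker threads is reused across all requests,
    // so thread creation cost is paid poolSize times instead of once per request.
    static int handleWithPool(int requests, int poolSize) {
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < requests; i++) {
            pool.execute(handled::incrementAndGet);
        }
        pool.shutdown(); // stop accepting work; queued tasks still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("new threads: " + handleWithNewThreads(1000));
        System.out.println("pooled:      " + handleWithPool(1000, 8));
    }
}
```

A fixed-size pool also bounds the number of live threads under bursty load, which is one plausible reason a pooled design would be more stable; whether the patch uses a fixed, cached, or custom pool is not stated here.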

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
