hadoop-common-issues mailing list archives

From "Dmytro Molkov (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6713) The RPC server Listener thread is a scalability bottleneck
Date Thu, 22 Apr 2010 18:17:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859935#action_12859935 ]

Dmytro Molkov commented on HADOOP-6713:
---------------------------------------

We were running a performance test on our test cluster.
The test creates a tree of directories with files at the leaves, in depth-first fashion:
there is a root, and we create N directories in the root directory for the test. Each mapper
then starts in one of those directories and creates its own subtree with files at the
leaves.
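
The test code itself is not part of this message. Purely as an illustration, the write
phase of one mapper could look roughly like the sketch below. It uses the local filesystem
through java.nio instead of HDFS, and the class name, depth, fanout and file size
parameters are all hypothetical, not taken from the real test.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TreeWriteSketch {
    /**
     * Creates a subtree under dir in depth-first order and returns the number
     * of files written. depth, fanout and fileSize are illustrative parameters.
     */
    static int createSubtree(Path dir, int depth, int fanout, int fileSize)
            throws IOException {
        Files.createDirectories(dir);
        if (depth == 0) {
            // Leaf directory: write the files here.
            for (int i = 0; i < fanout; i++) {
                Files.write(dir.resolve("file" + i), new byte[fileSize]);
            }
            return fanout;
        }
        int files = 0;
        for (int i = 0; i < fanout; i++) {
            // Finish each child subtree completely before starting the next one.
            files += createSubtree(dir.resolve("dir" + i), depth - 1, fanout, fileSize);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("tree-sketch");
        int files = createSubtree(root, 2, 2, 4096);
        System.out.println("created " + files + " files under " + root);
    }
}
```

In a layout like this, each mapper would call createSubtree on its own top-level directory,
so the mappers never write into each other's subtrees.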

Then there is a read job in which each mapper runs ls on a directory and picks a random
entry from the listing. If the entry is a directory, the mapper repeats the step there; if
it is a file, the mapper reads it. The files are 4K in size, so the read time is small and
this job mostly hits the namenode.
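
As a rough sketch of that read loop, again against the local filesystem rather than HDFS
and with hypothetical names (the real job is not shown in this message), one walk could be
written as:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Stream;

class RandomWalkRead {
    /**
     * Lists the current directory, picks a random entry, descends if it is a
     * directory and reads it if it is a file. Returns the number of bytes
     * read, or -1 if the walk ends in an empty directory.
     */
    static long walkAndRead(Path root, Random rng) throws IOException {
        Path current = root;
        while (true) {
            List<Path> entries = new ArrayList<>();
            try (Stream<Path> listing = Files.list(current)) { // the "ls" step
                listing.forEach(entries::add);
            }
            if (entries.isEmpty()) {
                return -1;
            }
            Path choice = entries.get(rng.nextInt(entries.size()));
            if (Files.isDirectory(choice)) {
                current = choice; // repeat the step one level down
            } else {
                return Files.readAllBytes(choice).length; // small read, metadata dominates
            }
        }
    }
}
```

Each step down the tree costs one listing, so with small files most of the work per walk is
metadata lookups, which is why a job of this shape loads the namenode far more than the
datanodes.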

We were running a branch that had this fix and also had read-write locks in the namenode
instead of synchronized sections.
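
To illustrate the general idea of replacing synchronized sections with a read-write lock,
here is a minimal sketch built on java.util.concurrent.locks.ReentrantReadWriteLock. The
class and its map of file sizes are hypothetical stand-ins; this is not the namenode code
from the branch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class NamespaceSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Integer> fileSizes = new HashMap<>();

    // Writers take the exclusive lock, much as a synchronized method would.
    void create(String path, int size) {
        lock.writeLock().lock();
        try {
            fileSizes.put(path, size);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock, so many lookups can run at the same time.
    Integer getSize(String path) {
        lock.readLock().lock();
        try {
            return fileSizes.get(path);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With synchronized methods every lookup would wait for every other lookup; with the shared
read lock only writers force readers to wait, which matters most for a read-heavy load.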

The version without the fixes could only get the namenode to use 175% CPU. With the fixes in
place we were using 750% CPU for the read-only load (when the second job was running on its
own) and 550% for the read-write load (when the two jobs were running in parallel).

In the read-write mode the ratio of reads to writes was 8:1 (800 read clients vs 100 write
clients).

We are not putting the read-write locks in production in this iteration, since we feel we
need to do more testing on them. As soon as I have some results for the branch with only
this fix, I will post my findings here.

> The RPC server Listener thread is a scalability bottleneck
> ----------------------------------------------------------
>
>                 Key: HADOOP-6713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6713
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.21.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HADOOP-6713.patch
>
>
> The Hadoop RPC Server implementation has a single Listener thread that reads data from
> the socket and puts it into a call queue. This means that this single thread can pull RPC
> requests off the network only as fast as a single CPU can execute. This is a scalability
> bottleneck in our cluster.
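
One general way around a bottleneck of the kind described in the issue is to let the
listener only hand work off, while several reader threads fill the shared call queue. The
sketch below simulates that with in-memory queues instead of sockets. It is an assumption
made for illustration, with hypothetical names, and is not the attached HADOOP-6713.patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MultiReaderSketch {
    /**
     * Moves every request into one shared call queue using numReaders threads.
     * numReaders is assumed to be positive.
     */
    static BlockingQueue<String> drain(List<String> requests, int numReaders)
            throws InterruptedException {
        BlockingQueue<String> callQueue = new LinkedBlockingQueue<>();
        List<BlockingQueue<String>> pending = new ArrayList<>();
        for (int i = 0; i < numReaders; i++) {
            pending.add(new LinkedBlockingQueue<>());
        }
        // Listener role: no parsing here, only a round-robin hand-off to the readers.
        for (int i = 0; i < requests.size(); i++) {
            pending.get(i % numReaders).add(requests.get(i));
        }
        // Reader role: each thread drains its own share into the shared call queue.
        List<Thread> readers = new ArrayList<>();
        for (BlockingQueue<String> mine : pending) {
            Thread reader = new Thread(() -> {
                String request;
                while ((request = mine.poll()) != null) {
                    callQueue.add(request);
                }
            });
            readers.add(reader);
            reader.start();
        }
        for (Thread reader : readers) {
            reader.join();
        }
        return callQueue;
    }
}
```

In this arrangement the per-request work is spread over numReaders threads, so the rate at
which requests reach the call queue is no longer tied to what a single thread can do.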

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

