hadoop-common-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6713) The RPC server Listener thread is a scalability bottleneck
Date Mon, 26 Apr 2010 21:02:40 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861108#action_12861108 ]

Hairong Kuang commented on HADOOP-6713:

This is a great idea! Separating "accept" from "read" should also greatly reduce the Connection
reset errors observed at the client when NameNode is busy. Dhruba asked me to review this
patch. So here are a few comments:

1. Please remove the System.out.println or change it to a log statement;
2. Listener#run() should remove the doRead() else branch;
3. Now that accept is done in a separate thread, doAccept() should accept as many connections as
possible (not limited to 10 as in the trunk). Another option is to use a blocking accept channel.
4. Optional: the synchronization between listener thread & read thread is very interesting.
It took me a while to figure out that it works. But it seems to me that the code is hard to
understand and maintain. Another option is that each reader thread maintains a queue of pending
registration channels. After choosing a reader, a listener thread simply adds an accepted
channel into its pending queue and then wakes up the reader thread. Each reader thread registers
all the pending channels before select().
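
To make comments 3 and 4 concrete, here is a minimal sketch of the pending-queue idea. It is not taken from the attached patch: the names PendingQueueSketch, Reader, addPending, registerPending and acceptAll are hypothetical, and the round-robin choice of reader is an assumption. Only java.nio calls are used.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; none of these names come from the Hadoop source or the patch.
class PendingQueueSketch {

    /** One reader thread: owns a selector plus a queue of channels awaiting registration. */
    static class Reader implements Runnable {
        private final Selector selector;
        private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();
        private volatile boolean running = true;

        Reader() throws IOException {
            selector = Selector.open();
        }

        /** Called by the listener thread: enqueue the channel, then wake the reader. */
        void addPending(SocketChannel channel) {
            pending.add(channel);
            // If the reader is not in select() yet, its next select() returns immediately.
            selector.wakeup();
        }

        /** Called only by the reader thread, right before select(). Returns how many were registered. */
        int registerPending() throws IOException {
            int registered = 0;
            SocketChannel channel;
            while ((channel = pending.poll()) != null) {
                channel.configureBlocking(false);
                channel.register(selector, SelectionKey.OP_READ);
                registered++;
            }
            return registered;
        }

        @Override
        public void run() {
            try {
                while (running) {
                    registerPending();
                    selector.select();
                    // A real reader would iterate over the selected keys and read requests here.
                    selector.selectedKeys().clear();
                }
            } catch (IOException e) {
                // Sketch only: a real server would log the error and clean up connections.
            } finally {
                try {
                    selector.close();
                } catch (IOException ignored) {
                    // Nothing useful to do in a sketch.
                }
            }
        }

        void shutdown() {
            running = false;
            selector.wakeup();
        }
    }

    /**
     * Listener side (comment 3): drain the whole accept backlog instead of stopping at 10.
     * Assumes the server channel is non-blocking, so accept() returns null when nothing is pending.
     * Readers are chosen round-robin starting at index next; returns the number accepted.
     */
    static int acceptAll(ServerSocketChannel server, Reader[] readers, int next) throws IOException {
        int accepted = 0;
        SocketChannel channel;
        while ((channel = server.accept()) != null) {
            readers[(next + accepted) % readers.length].addPending(channel);
            accepted++;
        }
        return accepted;
    }
}
```

With this shape the listener never touches a reader's selector except through wakeup(), so registration always happens on the reader thread and no extra locking between the two threads is needed.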

> The RPC server Listener thread is a scalability bottleneck
> ----------------------------------------------------------
>                 Key: HADOOP-6713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6713
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.21.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HADOOP-6713.patch
> The Hadoop RPC Server implementation has a single Listener thread that reads data from
the socket and puts them into a call queue. This means that this single thread can pull RPC
requests off the network only as fast as a single CPU can execute. This is a scalability bottleneck
in our cluster.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
