hadoop-common-issues mailing list archives

From "Luke Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9956) RPC listener inefficiently assigns connections to readers
Date Fri, 27 Sep 2013 16:48:03 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780082#comment-13780082 ]

Luke Lu commented on HADOOP-9956:
---------------------------------

Closing idle connections might be the only option if you don't want a client to trivially
DoS the server, accidentally or not, by opening too many idle connections. If an application
protocol cares about idempotence, the application should handle it; i.e., we should fix the
job client to avoid submitting duplicate jobs. Otherwise, many network issues will cause the
same problem. We can even make it a little more client-friendly by responding with an empty
RPC frame with a busy code before closing the connection.
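
As a rough illustration of the idea above, here is a minimal sketch in Java. Everything in it is an assumption made for the example: the class name IdleConnectionTracker, the BUSY_CODE value, and the 8-byte frame layout (status code followed by a zero payload length) are hypothetical and are not Hadoop's actual IPC classes or wire format. The policy shown (only evict once the server is over a connection limit) is also just one possible choice.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: remembers last activity per connection and picks idle ones to close. */
final class IdleConnectionTracker {
    /** Hypothetical "server busy" status code; not a real Hadoop constant. */
    static final int BUSY_CODE = 1;

    private final int maxConnections;
    private final long maxIdleMillis;
    // Connection id -> time of last activity. Not thread-safe; callers must synchronize.
    private final Map<String, Long> lastActivity = new HashMap<>();

    IdleConnectionTracker(int maxConnections, long maxIdleMillis) {
        this.maxConnections = maxConnections;
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Records activity on a connection (new connection or a received call). */
    void touch(String connectionId, long nowMillis) {
        lastActivity.put(connectionId, nowMillis);
    }

    /**
     * Once the server holds more than maxConnections, returns (and forgets) every
     * connection that has been idle for longer than maxIdleMillis.
     */
    List<String> evictIdle(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        if (lastActivity.size() <= maxConnections) {
            return evicted; // under the limit: leave idle connections alone
        }
        Iterator<Map.Entry<String, Long>> it = lastActivity.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxIdleMillis) {
                evicted.add(entry.getKey());
                it.remove();
            }
        }
        return evicted;
    }

    /** Builds the hypothetical "empty frame with a busy code": status int, then payload length 0. */
    static ByteBuffer busyFrame() {
        ByteBuffer frame = ByteBuffer.allocate(8);
        frame.putInt(BUSY_CODE);
        frame.putInt(0); // no payload
        frame.flip();
        return frame;
    }
}
```

A server loop would write busyFrame() to each evicted connection before closing it, so that a well-behaved client can tell "server busy, retry" apart from an arbitrary network failure.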
                
> RPC listener inefficiently assigns connections to readers
> ---------------------------------------------------------
>
>                 Key: HADOOP-9956
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9956
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-9956.patch
>
>
> The socket listener and readers use complex synchronization to update the reader's
> NIO {{Selector}}.  Updating active selectors is not thread-safe, so precautions are required.
> However, the current locking choreography results in a serialized distribution of new
> connections to the parallel socket readers.  A slower/busier reader can stall the listener
> and throttle performance.
> The problem manifests as unexpectedly low CPU utilization by the listener and readers
> (~20-30%) under heavy load.  The call queue is shallow when it should be overflowing.
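
For readers unfamiliar with the pattern, one common way to keep a listener from stalling on a busy reader is to hand connections off through a per-reader queue and let the reader thread do the selector registration itself. The sketch below illustrates that general technique only; it is not taken from the attached patch, and the class and method names (QueueingReader, addConnection, registerPending) are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical reader that receives new connections through a queue instead of a shared lock. */
final class QueueingReader implements AutoCloseable {
    private final Selector selector;
    private final BlockingQueue<SelectableChannel> pending = new LinkedBlockingQueue<>();

    QueueingReader() throws IOException {
        this.selector = Selector.open();
    }

    /** Called by the listener thread: enqueue and wake the reader; never waits on the reader. */
    void addConnection(SelectableChannel channel) {
        pending.add(channel);
        selector.wakeup(); // makes a blocked select() return so the reader can register
    }

    /** Called by the reader thread at the top of each loop iteration; returns how many it registered. */
    int registerPending() throws IOException {
        int registered = 0;
        SelectableChannel channel;
        while ((channel = pending.poll()) != null) {
            channel.configureBlocking(false);
            channel.register(selector, SelectionKey.OP_READ);
            registered++;
        }
        return registered;
    }

    /** Number of channels currently registered with this reader's selector. */
    int registeredCount() {
        return selector.keys().size();
    }

    @Override
    public void close() throws IOException {
        selector.close();
    }
}
```

With this arrangement the listener only performs a queue insert and a wakeup per connection, so a slow reader delays its own registrations rather than the distribution of connections to the other readers.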

