hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3109) RPC should accepted connections even when rpc queue is full (ie undo part of HADOOP-2910)
Date Mon, 31 Mar 2008 17:42:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583768#action_12583768 ]

Doug Cutting commented on HADOOP-3109:

> What should be a good backlog size?

Perhaps this should be proportional to the call queue length?  Currently we queue 100 calls per handler
with 10 handlers, or 1000 by default.  The backlog is currently 128.  So setting the backlog
to the call queue length would make it 1000 by default.  Folks with large clusters increase
the number of handlers to 50 or so, so they'd get a backlog of 5000.  Does that sound like
enough, or should we use a multiple of this?
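
The arithmetic above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the actual Hadoop server code: the class name `BacklogSketch`, its methods, and the `multiple` parameter are hypothetical, and only the figures (100 calls per handler, 10 or 50 handlers) come from the comment. It relies on the standard `ServerSocket.bind(SocketAddress, int)` call, whose backlog argument the OS may treat as a hint and cap.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper, not part of Hadoop: derives the listen backlog
// from the call queue length instead of using a fixed value such as 128.
class BacklogSketch {
    // Figure quoted in the comment: 100 queued calls per handler.
    static final int CALLS_PER_HANDLER = 100;

    // Call queue length = handlers * calls per handler (10 handlers -> 1000).
    static int callQueueLength(int handlerCount) {
        return handlerCount * CALLS_PER_HANDLER;
    }

    // Backlog proportional to the call queue; multiple = 1 means "equal to it".
    static int backlogFor(int handlerCount, int multiple) {
        return callQueueLength(handlerCount) * multiple;
    }

    // Binds a listening channel with the derived backlog.
    // The OS may silently cap the requested backlog.
    static ServerSocketChannel listen(int port, int handlerCount) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        channel.socket().bind(new InetSocketAddress(port), backlogFor(handlerCount, 1));
        return channel;
    }
}
```

With these assumptions, `backlogFor(10, 1)` gives the default of 1000 and `backlogFor(50, 1)` gives the 5000 mentioned for large clusters; a larger `multiple` corresponds to the "multiple of this" option.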

> RPC should accepted connections even when rpc queue is full (ie undo part of HADOOP-2910)
> -----------------------------------------------------------------------------------------
>                 Key: HADOOP-3109
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3109
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Sanjay Radia
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.17.0
> HADOOP-2910 changed HDFS to stop accepting new connections when the rpc queue is full.
> It should continue to accept connections and let the OS deal with limiting connections.
> HADOOP-2910's decision to not read from open sockets when queue is full is exactly right
> - backup on the
> client sockets and they will just wait (especially with HADOOP-2188 that removes client
> However we should continue to accept connections:
> The OS refuses new connections after a large number of connections are open (this is a
> configurable parameter). With this patch, we have a new lower limit for # of open connections
> when the RPC queue is full.
> The problem is that when there is a surge of requests, we would stop
> accepting connections and clients will get a connection failed (a change from old behavior).
> Instead if you continue to accept connections it is likely that the surge will be over
> shortly and
> clients will get served. Of course if the surge lasts a long time the OS will stop accepting
> and clients will fail and there is not much one can do (except raise the OS limit).
> I propose that we continue accepting connections, but not read from
> connections when the RPC queue is full (i.e. undo part of the 2910 work, back to the old behavior).
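
The proposal in the quoted description (keep accepting, but stop reading while the queue is full) can be sketched with Java NIO interest sets. This is a minimal sketch, not the Hadoop RPC server: the class `AcceptButDeferReads` and all of its methods are hypothetical, and it only models which selector interest ops a server would choose. A real server would apply the returned value with `SelectionKey.interestOps(int)` on each key.

```java
import java.nio.channels.SelectionKey;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model of the proposed policy, not Hadoop code.
class AcceptButDeferReads {
    private final BlockingQueue<byte[]> callQueue;

    AcceptButDeferReads(int capacity) {
        this.callQueue = new ArrayBlockingQueue<>(capacity);
    }

    boolean queueFull() {
        return callQueue.remainingCapacity() == 0;
    }

    // Listening socket: always stay interested in new connections,
    // leaving it to the OS to limit how many can be open.
    int listenerInterestOps() {
        return SelectionKey.OP_ACCEPT;
    }

    // Client connection: read only while the call queue has room.
    // With no read interest, requests back up on the client sockets.
    int connectionInterestOps() {
        return queueFull() ? 0 : SelectionKey.OP_READ;
    }

    // Called by a reader after parsing a call; returns false if the queue is full.
    boolean enqueue(byte[] call) {
        return callQueue.offer(call);
    }

    // Called by a handler; returns null if no call is waiting.
    byte[] nextCall() {
        return callQueue.poll();
    }
}
```

Once a handler drains a call with `nextCall()`, `connectionInterestOps()` returns `OP_READ` again, so reads resume without any connection having been refused in the meantime.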

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
