hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3109) RPC should accept connections even when rpc queue is full (ie undo part of HADOOP-2910)
Date Sat, 29 Mar 2008 01:32:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583261#action_12583261 ]

Hairong Kuang commented on HADOOP-3109:
---------------------------------------

> Wouldn't it be easier to increase the sockets backlog size and remove the connect timeout?
Increasing the socket backlog size might be a good solution. What would be a good backlog size?
The connect timeout was already removed in HADOOP-2910.
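For reference, in Java the listen backlog is supplied when binding the server socket. A minimal sketch (the value 128 is purely illustrative, not a recommendation, and the kernel may silently cap whatever is requested, e.g. at net.core.somaxconn on Linux):

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class BacklogSketch {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        // Bind to an ephemeral port, asking for a backlog of 128 pending
        // (accepted-by-kernel, not-yet-accepted-by-app) connections.
        // 128 is an illustrative value; the OS may silently cap it.
        server.socket().bind(new InetSocketAddress(0), 128);
        boolean bound = server.socket().isBound();
        System.out.println(bound);
        server.close();
    }
}
```

A larger backlog lets the kernel absorb a connection surge before clients start seeing connection refusals, which is the same goal the issue is discussing.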

> RPC should accept connections even when rpc queue is full (ie undo part of HADOOP-2910)
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3109
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3109
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Sanjay Radia
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.17.0
>
>
> HADOOP-2910 changed HDFS to stop accepting new connections when the rpc queue is full.
> It should continue to accept connections and let the OS deal with limiting connections.
> HADOOP-2910's decision to not read from open sockets when the queue is full is exactly
> right - requests back up on the client sockets and the clients will just wait (especially
> with HADOOP-2188, which removes client timeouts).
> However, we should continue to accept connections:
> The OS refuses new connections only after a large number of connections are open (this is
> a configurable parameter). With this patch, we have a new, lower limit on the number of
> open connections when the RPC queue is full.
> The problem is that when there is a surge of requests, we stop accepting connections and
> clients get a connection failure (a change from the old behavior).
> If instead we continue to accept connections, it is likely that the surge will be over
> shortly and the clients will get served. Of course, if the surge lasts a long time, the OS
> will stop accepting connections, clients will fail, and there is not much one can do
> (except raise the OS limit).
> I propose that we continue accepting connections but not read from connections when the
> RPC queue is full (ie undo part of the 2910 work, back to the old behavior).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

