hadoop-common-dev mailing list archives

From "Yoram Arnon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-312) Connections should not be cached
Date Tue, 08 Aug 2006 00:32:15 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-312?page=comments#action_12426378 ] 
Yoram Arnon commented on HADOOP-312:

The current model, where a tasktracker sends a heartbeat every few seconds and is declared
dead only after 10 minutes, is fine as long as the rate of lost heartbeats is low.
My only concern is that, since the accept queue is fairly small by default (~5), the
probability of missing a new connection will be significantly greater than the probability
of losing a message in an open TCP stream (very low). Increasing the size of the queue, or
implementing some form of retries, could help.
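
Both mitigations can be sketched in plain Java sockets. This is only an illustration, not Hadoop's ipc code: the class name, the backlog of 128, the three attempts, and the 100 ms starting backoff are all assumed values, and the operating system may cap the backlog it actually grants.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class HeartbeatConnect {
    // Hypothetical values chosen for illustration, not Hadoop settings.
    static final int BACKLOG = 128;
    static final int MAX_ATTEMPTS = 3;
    static final long INITIAL_BACKOFF_MS = 100;

    /** Server side: request a larger accept queue than the small default. */
    static ServerSocket listen(int port) throws IOException {
        // The second argument is the requested backlog (pending-connection queue length).
        return new ServerSocket(port, BACKLOG);
    }

    /** Client side: retry a failed connect a few times, doubling the wait each time. */
    static Socket connectWithRetry(String host, int port, int timeoutMs)
            throws IOException, InterruptedException {
        IOException last = null;
        long backoffMs = INITIAL_BACKOFF_MS;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(host, port), timeoutMs);
                return socket;
            } catch (IOException e) {
                socket.close();
                last = e;
                if (attempt < MAX_ATTEMPTS) {
                    Thread.sleep(backoffMs);
                    backoffMs *= 2;
                }
            }
        }
        // Every attempt failed; report the most recent error.
        throw last;
    }
}
```

Retries only help if a lost heartbeat connection is retried well within the 10-minute timeout, which a few short backoffs easily satisfy.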

There's a tradeoff here between the overhead of many 'accept's per second and the overhead
of 'select'ing on many sockets, many of which are idle. Let's compare performance with and
without connection caching, and see where we get more lost heartbeats, and better jobtracker
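
As a rough way to size the two sides of this tradeoff before measuring, the arithmetic can be written down as below. The class and method names are made up for this sketch, and it ignores retries and any partial caching.

```java
public class HeartbeatLoad {
    /** Without caching: every heartbeat is a new connection the server must accept. */
    static double acceptsPerSecond(int clients, double heartbeatIntervalSec) {
        return clients / heartbeatIntervalSec;
    }

    /** With full caching: one open socket per client for the server to select over. */
    static int openSocketsWhenCached(int clients) {
        return clients;
    }
}
```

For example, with 10,000 clients and a 5-second heartbeat interval (both assumed figures), no caching means 2,000 accepts per second, while full caching means selecting over 10,000 sockets, most of them idle at any instant.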

> Connections should not be cached
> --------------------------------
>                 Key: HADOOP-312
>                 URL: http://issues.apache.org/jira/browse/HADOOP-312
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: ipc
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>         Attachments: no_connection_caching.patch, no_connection_caching.patch
> Servers and clients (clients include datanodes, tasktrackers, DFSClients & tasks)
> should not cache connections, or should cache them only for very short periods of time. Clients should
> set up and tear down connections to the servers every time they need to contact them
> (including for heartbeats). If a connection is cached, the client reuses it for
> a few subsequent transactions until it expires. The heartbeat interval should
> be longer, so that many more clients (on the order of tens of thousands) can be accommodated within
> one heartbeat interval.
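
The short-lived caching option in the description could look roughly like the sketch below. It is not taken from the attached patch: the class name, the idle-time expiry rule, and the injected clock and opener are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class ExpiringConnectionCache<K, C extends AutoCloseable> {
    private static final class Entry<C> {
        final C conn;
        long lastUsedMs;
        Entry(C conn) { this.conn = conn; }
    }

    private final Map<K, Entry<C>> entries = new HashMap<>();
    private final long maxIdleMs;          // assumed expiry rule: idle time since last use
    private final LongSupplier clock;      // injected so tests can control time
    private final Function<K, C> opener;   // hypothetical factory that opens a connection

    public ExpiringConnectionCache(long maxIdleMs, LongSupplier clock, Function<K, C> opener) {
        this.maxIdleMs = maxIdleMs;
        this.clock = clock;
        this.opener = opener;
    }

    /** Returns the cached connection if it is still fresh, otherwise opens a new one. */
    public synchronized C get(K key) {
        long now = clock.getAsLong();
        Entry<C> entry = entries.get(key);
        if (entry != null && now - entry.lastUsedMs > maxIdleMs) {
            // Expired: tear the connection down and forget it.
            try {
                entry.conn.close();
            } catch (Exception ignored) {
                // Best-effort close; the connection is discarded either way.
            }
            entries.remove(key);
            entry = null;
        }
        if (entry == null) {
            entry = new Entry<>(opener.apply(key));
            entries.put(key, entry);
        }
        entry.lastUsedMs = now;
        return entry.conn;
    }
}
```

Setting `maxIdleMs` to zero approximates the no-caching case, since any later call opens a fresh connection; a value slightly above the time between transactions lets a few consecutive calls share one connection.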


