Return-Path:
Delivered-To: apmail-hadoop-core-dev-archive@www.apache.org
Received: (qmail 98112 invoked from network); 17 Mar 2008 22:48:13 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2)
    by minotaur.apache.org with SMTP; 17 Mar 2008 22:48:13 -0000
Received: (qmail 35906 invoked by uid 500); 17 Mar 2008 22:48:09 -0000
Delivered-To: apmail-hadoop-core-dev-archive@hadoop.apache.org
Received: (qmail 35884 invoked by uid 500); 17 Mar 2008 22:48:09 -0000
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: core-dev@hadoop.apache.org
Delivered-To: mailing list core-dev@hadoop.apache.org
Received: (qmail 35875 invoked by uid 99); 17 Mar 2008 22:48:09 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 17 Mar 2008 15:48:09 -0700
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0
    tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 17 Mar 2008 22:47:29 +0000
Received: from brutus (localhost [127.0.0.1])
    by brutus.apache.org (Postfix) with ESMTP id 69E48234C0A5
    for ; Mon, 17 Mar 2008 15:46:24 -0700 (PDT)
Message-ID: <2124988205.1205793984432.JavaMail.jira@brutus>
Date: Mon, 17 Mar 2008 15:46:24 -0700 (PDT)
From: "Raghu Angadi (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-2188) RPC should send a ping rather than use client timeouts
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Virus-Checked: Checked by ClamAV on apache.org


    [ https://issues.apache.org/jira/browse/HADOOP-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579652#action_12579652 ]

Raghu Angadi commented on HADOOP-2188:
--------------------------------------

I am looking at the patch too, though I am not done yet. A couple of questions:

# Why does the patch remove the purging of connections on the server when the server is unable to write to the socket?
# Minor: in one place the patch has '{{do { wait(); } while (!cond);}}'. This is unconventional: normally the right thing to do is to check the condition before waiting.


> RPC should send a ping rather than use client timeouts
> ------------------------------------------------------
>
>                 Key: HADOOP-2188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2188
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs, ipc
>            Reporter: Owen O'Malley
>            Assignee: Hairong Kuang
>         Attachments: ipc-timeout.patch, ipc-timeout1.patch, ipc-timeout2.patch, ipc-timeout3.patch, rpc-to.patch
>
>
> Current RPC (really IPC) relies on client side timeouts to find "dead" sockets. I propose that we have a thread that once a minute (if the connection has been idle) writes a "ping" message to the socket. The client can detect a dead socket by the resulting error on the write, so no client side timeout is required. Also note that the ipc server does not need to respond to the ping, just discard it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
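
To illustrate the second question above about '{{do { wait(); } while (!cond);}}': the sketch below shows the conventional form, where the condition is tested before the first wait. It is only an illustration and is not taken from the patch; the class ResponseBox and its methods await() and complete() are hypothetical names made up for this example. The point is that if the condition is already true when the waiter arrives, a do/while form would still call wait() once and could block on a notification that has already been delivered.

```java
/** Hypothetical one-slot mailbox for illustration; not code from the HADOOP-2188 patch. */
class ResponseBox {
    private String value;   // guarded by this object's monitor
    private boolean ready;  // the condition that waiters test

    /** Blocks until complete() has been called, then returns the stored value. */
    public synchronized String await() {
        try {
            // Test the condition BEFORE waiting, and re-test after every wakeup,
            // since wait() may also return without a matching notify.
            while (!ready) {
                wait(); // releases the monitor while blocked
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return value;
    }

    /** Stores the value, marks the condition true, and wakes all waiters. */
    public synchronized void complete(String v) {
        value = v;
        ready = true;
        notifyAll();
    }

    public static void main(String[] args) {
        ResponseBox box = new ResponseBox();
        box.complete("pong");            // condition becomes true before anyone waits
        System.out.println(box.await()); // returns immediately; a do/while form would block here
    }
}
```

In main() the value is completed before await() is called, which is exactly the ordering where checking the condition first matters.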