Date: Wed, 16 Sep 2009 14:49:15 -0400
Subject: Re: different modes of inter process communication
From: "B. X."
To: common-dev@hadoop.apache.org

On Wed, Aug 19, 2009 at 7:10 PM, Owen O'Malley wrote:

Thank you both for clearing that up. I have a related question. My understanding is that a basic heartbeat mechanism is used to keep the different roles (namenode, datanode, tasktracker, etc.) aware of each other, but I cannot observe this in the logs.

For example, if I use SIGSTOP/SIGCONT to suspend the namenode JVM process for 30 seconds and then resume it, I do not see any extra communication caused by the supposedly missed heartbeats. (I checked that dfs.heartbeat.interval is set to 3 seconds.) Instead, all roles appear to stop in unison for 30 seconds: none of them logs any events during that window.

I would appreciate some pointers on how heartbeats are used and configured.

-Bin
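
For anyone who wants to reproduce the pause experiment described above, here is a minimal sketch in Python for a POSIX system. The helper names `pause_process` and `heartbeats_missed` are made up for illustration and are not part of Hadoop. The count is only a rough estimate that assumes one heartbeat per interval and ignores timing jitter.

```python
import os
import signal
import time


def pause_process(pid: int, seconds: float) -> None:
    """Suspend process `pid` with SIGSTOP, wait, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)
    try:
        time.sleep(seconds)
    finally:
        # Always resume the process, even if the sleep is interrupted.
        os.kill(pid, signal.SIGCONT)


def heartbeats_missed(pause_seconds: float, interval_seconds: float) -> int:
    """Rough count of whole heartbeat intervals that elapse during the pause.

    Assumption: one heartbeat is expected per interval.
    """
    return int(pause_seconds // interval_seconds)
```

With the values from the message (a 30-second pause and a 3-second interval), `heartbeats_missed(30, 3)` gives 10, so roughly ten heartbeats would be expected to go unanswered per sender. Calling `pause_process(namenode_pid, 30)` performs the pause itself, where `namenode_pid` is a placeholder for the PID of the namenode JVM, which has to be looked up separately.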