From: daemeon reiydelle
Date: Sun, 14 Dec 2014 14:28:14 -0800
Subject: Re: What happens to data nodes when name node has failed for long time?
To: user@hadoop.apache.org

I found the terminology of primary and secondary to be a bit confusing in
describing operation after a failure scenario. Perhaps it is helpful to
think that the Hadoop instance is guided to select a node as primary for
normal operation. If that node fails, then the backup becomes the new
primary. In analyzing traffic, it appears that the restored node does not
become primary again until the whole instance restarts. I myself would
welcome clarification on this observed behavior.

“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming “Wow! What a Ride!”
- Hunter Thompson

Daemeon C.M. Reiydelle

On Fri, Dec 12, 2014 at 7:56 AM, Rich Haase wrote:

> The remaining cluster services will continue to run. That way, when the
> namenode (or other failed processes) is restored, the cluster will resume
> healthy operation. This is part of Hadoop’s ability to handle network
> partition events.
>
> Rich Haase | Sr. Software Engineer | Pandora
>
> From: Chandrashekhar Kotekar
> Reply-To: "user@hadoop.apache.org"
> Date: Friday, December 12, 2014 at 3:57 AM
> To: "user@hadoop.apache.org"
> Subject: What happens to data nodes when name node has failed for long
> time?
>
> Hi,
>
> What happens if the name node has been down for more than one hour but
> the secondary name node, all the data nodes, the job tracker, and the
> task trackers are running fine? Do those daemon services also
> automatically shut down after some time? Or do those services keep
> running, hoping for the namenode to come back?
>
> Regards,
> Chandrash3khar Kotekar
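
To make the behavior described in this thread concrete, here is a minimal toy model of it in Python. It is only a sketch of the observation reported above (the backup takes over on failure, and a restored node does not reclaim the primary role until the whole instance restarts). It is not Hadoop code: the `Node` and `Cluster` classes, their methods, and the node names `nn1` and `nn2` are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical node that is either up or down."""
    name: str
    up: bool = True


@dataclass
class Cluster:
    """Toy model of the primary/backup roles described in the thread.

    `preferred` is the node the instance selects as primary at (re)start.
    `primary` is whichever node currently holds the primary role.
    """
    preferred: Node
    backup: Node
    primary: Node = field(init=False)

    def __post_init__(self):
        # At startup the preferred node is selected as primary.
        self.primary = self.preferred

    def fail(self, node: Node) -> None:
        """Mark a node as down; promote the other node if it is healthy."""
        node.up = False
        if node is self.primary:
            other = self.backup if node is self.preferred else self.preferred
            if other.up:
                self.primary = other

    def restore(self, node: Node) -> None:
        """Bring a node back. It rejoins without reclaiming the primary role."""
        node.up = True

    def restart_all(self) -> None:
        """Restart the whole instance; the preferred node is selected again."""
        self.preferred.up = True
        self.backup.up = True
        self.primary = self.preferred


if __name__ == "__main__":
    nn1, nn2 = Node("nn1"), Node("nn2")
    cluster = Cluster(preferred=nn1, backup=nn2)
    print("at startup:      ", cluster.primary.name)
    cluster.fail(nn1)
    print("after nn1 fails: ", cluster.primary.name)
    cluster.restore(nn1)
    print("after nn1 is back:", cluster.primary.name)
    cluster.restart_all()
    print("after full restart:", cluster.primary.name)
```

In this model the role only moves on a failure or a full restart, never on a restore, which matches the traffic observation in the message. Whether a real deployment behaves this way depends on how it is configured, so treat the model as a way to state the question precisely rather than as an answer to it.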