From: Mohammad Tariq
Date: Wed, 4 Sep 2013 02:48:10 +0530
Subject: Re: so the master just died... now what?
To: user@hadoop.apache.org

Hello Tomasz,

Just to add: although it is named *masters*, the *conf/masters* file actually specifies the machine where the *SecondaryNameNode* will run. The master daemons run on the machine where you execute the start scripts. If you need to change the master machine, you must make the appropriate changes in the *core-site.xml* and *mapred-site.xml* files. Also, update the new master's IP and hostname in the */etc/hosts* file of your slaves.

Warm Regards,
Tariq
cloudfront.blogspot.com

On Wed, Sep 4, 2013 at 2:31 AM, Shahab Yunus wrote:

> Keep in mind that there are 2 flavors of Hadoop: the older one without HA
> and the new one with it. Anyway, have you seen the following?
>
> http://wiki.apache.org/hadoop/NameNodeFailover
> http://www.youtube.com/watch?v=Ln1GMkQvP9w
> http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_hdp1-system-admin-guide/content/sysadminguides_ha_chap2_5_5.html
>
> Regards,
> Shahab
>
> On Tue, Sep 3, 2013 at 4:54 PM, Tomasz Chmielewski wrote:
>
>> Just starting with Hadoop and HBase, and I couldn't find specific answers
>> in the official documentation (unless I've missed the obvious).
>>
>> Assuming I have three Hadoop servers, h1, h2 and h3, with h1 being a
>> master+slave: what is the recovery scenario if the master server, h1,
>> died and is beyond repair (burned with all disks and got flooded)?
>> Do I just edit the conf/masters file on one of the remaining slaves (say,
>> h2), make it a master, and start the NameNode and JobTracker there?
>>
>> Can anyone point me to the relevant documentation?
>>
>> --
>> Tomasz Chmielewski
>> http://wpkg.org
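Re-pointing the cluster at a new master, as Tariq describes, comes down to two files. A minimal sketch for a Hadoop 1.x cluster, assuming the replacement master is h2 and the conventional ports 9000 (HDFS) and 9001 (JobTracker) — substitute your own hostname and ports:

```xml
<!-- core-site.xml (all nodes): the NameNode now lives on h2 (assumed hostname/port) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://h2:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml (all nodes): the JobTracker now lives on h2 -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>h2:9001</value>
  </property>
</configuration>
```

Recovering the NameNode metadata itself (e.g. from the SecondaryNameNode's checkpoint) is a separate step; the NameNodeFailover wiki page linked earlier in the thread covers it.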
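Tariq's last step — updating the */etc/hosts* file on each slave — can be scripted. A sketch with fabricated values (the 10.0.0.x addresses are assumptions, not from the thread), operating on a local copy so it is safe to try; on a real slave you would edit /etc/hosts itself, as root:

```shell
# Fabricated example of a slave's hosts file before the failure.
printf '10.0.0.1 h1\n10.0.0.3 h3\n' > hosts.example

# Swap the dead master h1 for the new master h2 (assumed address 10.0.0.2).
sed -i 's/^10\.0\.0\.1 h1$/10.0.0.2 h2/' hosts.example

cat hosts.example
```

Once the hosts entries and the *-site.xml files all point at h2, the master daemons are started there with the usual start scripts, since they run on whichever machine executes them.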