Date: Fri, 2 Oct 2009 12:41:57 +0200
Subject: Re: NameNode high availability
From: Stas Oskin <stas.oskin@gmail.com>
To: common-user@hadoop.apache.org

Hi.

The HA service (heartbeat) is running on Dom0, and when the primary node
goes down it simply starts the VM on the other node, so there shouldn't
be any timing issues.

Can you explain a bit more about your approach, for example how to
automate it?

Thanks.

On 10/2/09, Steve Loughran wrote:
> Stas Oskin wrote:
>> Hi.
>>
>>> Could you share the way in which it didn't quite work? It would be
>>> valuable information for the community.
>>
>> The idea is to have a Xen machine dedicated to the NN, and maybe the
>> SNN, running over DRBD, as described here:
>> http://www.drbd.org/users-guide/ch-xen.html
>>
>> The VM is monitored by heartbeat, which restarts it on another node
>> when it fails.
>>
>> I wanted to go that way as I thought it was perfect for a small
>> cluster, since the node can then be re-used for other tasks. Once the
>> cluster grows reasonably, the VM could be live-migrated to a dedicated
>> machine with minimal downtime.
>>
>> The problem is that it didn't work as expected. Xen over DRBD is just
>> not reliable, as described.
>> The most basic operation, live domain migration, works in only 50% of
>> cases. Most often the migration leaves DRBD in read-only status,
>> meaning the domain can't be cleanly shut down, only killed. This in
>> turn often leads to NN metadata corruption.
>
> It's probably a quirk of virtualisation; all those clocks and things
> cause trouble for any HA protocol running round the cluster. I would
> not blame Xen, as VMware and VirtualBox are also tricky.
>
> As you have a virtual infrastructure, why not have an image of the
> primary NN, ready to bring up on demand when the NN goes down, pointed
> at a copy of the NN datasets?

--
Sent from my mobile device
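[Archive note: Steve's suggestion of a standby image "pointed at a copy of the NN datasets" could be approximated in stock Hadoop of that era by listing an extra, shared directory in `dfs.name.dir`, which accepts a comma-separated list; the NameNode then writes its fsimage and edit log to every listed directory. A minimal sketch, where the paths (including the NFS mount point `/mnt/nfs/namenode`) are illustrative assumptions, not taken from the thread:]

```xml
<!-- hdfs-site.xml (sketch, hypothetical paths):
     the NameNode writes metadata to BOTH directories below, so a standby
     host mounting the same NFS export can be started against the
     surviving copy after the primary dies. -->
<property>
  <name>dfs.name.dir</name>
  <value>/var/hadoop/namenode,/mnt/nfs/namenode</value>
</property>
```

[On failover, the standby host would mount the NFS export, point its own `dfs.name.dir` at that copy, and start the NameNode; heartbeat could automate the mount-and-start step, though clients still have to reach the new NN, e.g. via a floating IP or a shared DNS name.]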