hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HADOOP-4539 question
Date Wed, 12 Aug 2009 19:27:06 GMT
On Wed, Aug 12, 2009 at 12:06 PM, Konstantin Shvachko <shv@yahoo-inc.com> wrote:

> Stas,
> There is no HA solution currently for Hadoop.
> You can do things like Cloudera describes.
> Their solution works with 2 real name-nodes.
> No Backup node involved.
> As for Backup node, I don't really understand Todd's comment
> but the fact is that Backup node (BN) is not a standby
> node. The failover procedure is not implemented for BN,
> so neither clients nor data-nodes fail over anywhere
> when the main name-node (NN) dies; they don't have a clue.

Gotcha - I thought the long term goal for the BN was to eventually have it
work as a "warm standby" that could convert into a NN without restart.

My mistake


> The purpose of the BN is
> 1) to keep an up-to-date image of the namespace in memory.
> This does not include block locations.
> BN does not know where file blocks are.
> 2) to make periodic checkpoints, like SecondaryNameNode did,
> but more efficiently, since BN does not need to load image
> and edits from NN, its namespace is already up-to-date.
> There is provision to transform BN to a real standby node,
> with failover, but it has not been implemented yet.
> Hope this clarifies things.
> Thanks,
> --Konstantin
> Todd Lipcon wrote:
>> On Wed, Aug 12, 2009 at 3:42 AM, Stas Oskin <stas.oskin@gmail.com> wrote:
>>> Hi.
>>>> You can also use a utility like Linux-HA (aka heartbeat) to handle IP
>>>> address failover. It will even send gratuitous ARPs to make sure to get the
>>>> new mac address registered after a failover. Check out this blog for info
>>>> about a setup like this:
>>>> http://www.cloudera.com/blog/2009/07/22/hadoop-ha-configuration/
>>>> Hope that helps
>>> Thanks, exactly what I looked for :).
>>> I presume that with the coming BB node, there won't be need for DRBD, am I
>>> correct?
>> I haven't followed that development closely, but I believe that's the case.
>> The BackupNode will stream the FSEditLog writes as they occur while
>> replaying them into its own FSNamesystem. Then during a failover a real
>> NameNode starts on that FSNamesystem "ready to go". As for how the
>> BackupNode keeps track of block locations, I'm not sure - is there a
>> replication stream between BlockManagers too? Or is the cluster in a
>> broken state until all of the DNs have processed new block reports?
>> -Todd
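
To make the distinction discussed in this thread concrete, here is a toy sketch of
an edit stream being replayed into a second in-memory namespace. It is not Hadoop
code: the class names (Namespace, ActiveNode, BackupNode), the edit tuple format
and the method names are all hypothetical, chosen only for illustration. It shows
why a node that replays edits ends up with an up-to-date namespace but still has
no block locations, since those come from data-node block reports rather than
from the edit log.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy namespace: file path -> list of block IDs (no block locations)."""
    files: dict = field(default_factory=dict)

    def apply(self, edit):
        # An edit is a hypothetical (op, path, arg) tuple.
        op, path, arg = edit
        if op == "create":
            self.files[path] = []
        elif op == "add_block":
            self.files[path].append(arg)
        elif op == "delete":
            self.files.pop(path, None)
        else:
            raise ValueError(f"unknown edit op: {op}")


class ActiveNode:
    """Stands in for the main name-node in this sketch."""

    def __init__(self):
        self.namespace = Namespace()
        self.block_locations = {}  # block ID -> set of data-node names
        self.edit_listeners = []   # callables that receive every edit

    def log_edit(self, edit):
        # Apply the edit locally, then stream it to every listener.
        self.namespace.apply(edit)
        for listener in self.edit_listeners:
            listener(edit)

    def block_report(self, datanode, block_ids):
        # Block locations are learned from data-nodes, not from edits.
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)


class BackupNode:
    """Stands in for the Backup node: replays edits, gets no block reports."""

    def __init__(self):
        self.namespace = Namespace()
        self.block_locations = {}  # stays empty in this sketch

    def receive_edit(self, edit):
        self.namespace.apply(edit)


active = ActiveNode()
backup = BackupNode()
active.edit_listeners.append(backup.receive_edit)

active.log_edit(("create", "/a.txt", None))
active.log_edit(("add_block", "/a.txt", "blk_1"))
active.log_edit(("create", "/tmp.txt", None))
active.log_edit(("delete", "/tmp.txt", None))
active.block_report("dn1", ["blk_1"])

print(backup.namespace.files == active.namespace.files)
print(backup.block_locations)
```

In this sketch the two namespaces compare equal after the edits are replayed,
while the backup's block location map remains empty. That is the gap a real
failover procedure would have to close, for example by having the data-nodes
send block reports to the new node.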
