hadoop-hdfs-user mailing list archives

From "M. C. Srivas" <mcsri...@gmail.com>
Subject Re: Hadoop HA
Date Wed, 23 May 2012 04:45:57 GMT
On Tue, May 22, 2012 at 12:08 AM, Martinus Martinus wrote:

> Hi Todd,
> Thanks for your answer. Will that have the same capability as the
> commercial M5 of MapR : http://www.mapr.com/products/why-mapr ?
> Thanks.

Hi Martinus, here are some major differences in HA between MapR's M5 and
Apache Hadoop:

1. With M5, any node can become master at any time. It is a fully
active-active system. You can create a fully bomb-proof cluster, such that
in a 20-node cluster, you can configure it to survive even if 19 of the 20
nodes are lost. With Apache, it is a 1-1 active-passive system.

2. M5 does not require an NFS filer in the backend. Apache Hadoop requires a
NetApp or similar NFS filer to assist in saving the NN data, even in its HA
configuration. Note that for true HA, the NetApp or similar filer will also
need to be HA.

3. M5 has full HA for the JobTracker as well.
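
As a rough illustration of point 2, here is a minimal Python sketch of how a
single shared filer caps the availability of an otherwise redundant NameNode
pair. The availability figures are invented for the example, failures are
assumed to be independent, and the helper names (series, redundant) are
hypothetical; nothing here is measured from a real cluster.

```python
# Illustrative only: the availability figures below are made-up assumptions,
# not measurements of any NameNode, NFS filer, or MapR cluster.

def series(*parts):
    """Availability of components that must ALL be up at the same time."""
    total = 1.0
    for a in parts:
        total *= a
    return total

def redundant(a, copies):
    """Availability of `copies` independent replicas where any one suffices."""
    return 1.0 - (1.0 - a) ** copies

namenode = 0.99  # hypothetical availability of a single NameNode
filer = 0.99     # hypothetical availability of a single NFS filer

nn_pair = redundant(namenode, 2)                 # active + standby NameNode
single_filer = series(nn_pair, filer)            # HA NameNodes, one filer
ha_filer = series(nn_pair, redundant(filer, 2))  # HA NameNodes, HA filer

print(f"NameNode pair alone:    {nn_pair:.4f}")
print(f"pair + single filer:    {single_filer:.4f}")
print(f"pair + redundant filer: {ha_filer:.4f}")
```

Under these assumptions the pair with a single filer ends up no more
available than the filer itself, which is why the filer has to be HA too.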

Of course, HA is only a small part of the total business continuity story.
Full recovery in the face of any kind of failure is critical:

With M5:

- If there is a complete cluster crash and reboot (e.g., a full
power failure of the entire cluster), M5 will recover in 5-10 minutes, and
submitted jobs will resume from where they were.

- If you upgrade your software and it corrupts data, M5 provides snapshots
to help you recover. The number of times I've seen someone accidentally
run "hadoop fs -rmr /" and ask for help on this mailing list is beyond
counting. With M5, it is completely recoverable.

- Full disaster recovery across clusters by mirroring.

Hope that clarifies some of the differences.

> On Tue, May 22, 2012 at 2:26 PM, Todd Lipcon <todd@cloudera.com> wrote:
>> Hi Martinus,
>> Hadoop HA is available in Hadoop 2.0.0. This release is currently
>> being voted on in the community.
>> You can read more here:
>> http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
>> -Todd
>> On Mon, May 21, 2012 at 11:24 PM, Martinus Martinus
>> <martinus787@gmail.com> wrote:
>> > Hi,
>> >
>> > Is there any hadoop HA distribution out there?
>> >
>> > Thanks.
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
