From: Uma Maheswara Rao G <maheswara@huawei.com>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: RE: checkpointnode backupnode hdfs HA
Date: Thu, 16 Aug 2012 08:37:40 +0000

Hi Jan,

Don't confuse this with the BackupNode/CheckpointNode here.

The new HA architecture is mainly targeted at building HA around two NameNode states:
1) Active Namenode
2) Standby Namenode

When you start the NNs, both will start in standby mode by default.

Then you can switch one NN to the active state by issuing HA admin commands or by configuring the ZKFC (auto-failover) process (not officially released yet).
The NN will then start the required services according to its state.

This is almost like a new implementation of the StandbyNode checkpointing process.

The active NN will write edits to its local dirs and to the shared NN dirs. The standby node will keep tailing the edits from the shared NN dirs.
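
To make the active/standby flow above a bit more concrete, here is a small toy model in Python. It is only a sketch and not Hadoop code: the class and method names (SharedEditsDir, NameNode, log_edit, tail_edits) are hypothetical and the "edits" are plain strings. It only illustrates that both NNs start in standby, that one is switched to active, and that the standby catches up by tailing the shared edits.

```python
class SharedEditsDir:
    """Hypothetical stand-in for the shared edits storage."""

    def __init__(self):
        self.edits = []


class NameNode:
    """Toy NameNode with only the two HA states described above."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.state = "standby"   # every NN starts in standby
        self.local_edits = []    # edits written to the local dirs
        self.namespace = []      # edits applied to the in-memory state
        self.applied = 0         # how many shared edits were applied so far

    def transition_to_active(self):
        # Catch up on any outstanding shared edits before taking over.
        self.tail_edits()
        self.state = "active"

    def log_edit(self, op):
        # Only the active NN may write edits.
        if self.state != "active":
            raise RuntimeError(self.name + " is not active")
        self.local_edits.append(op)
        self.shared.edits.append(op)
        self.namespace.append(op)
        self.applied = len(self.shared.edits)

    def tail_edits(self):
        # Apply whatever appeared in the shared storage since the last call.
        new_edits = self.shared.edits[self.applied:]
        self.namespace.extend(new_edits)
        self.applied += len(new_edits)
        return len(new_edits)


if __name__ == "__main__":
    shared = SharedEditsDir()
    nn1 = NameNode("nn1", shared)
    nn2 = NameNode("nn2", shared)

    nn1.transition_to_active()
    nn1.log_edit("mkdir /a")
    nn1.log_edit("create /a/f")

    nn2.tail_edits()
    print(nn2.state, nn2.namespace)
```

Because the standby keeps its state in sync by tailing, it can take over quickly when it is switched to active. The real implementation has much more to it (fencing, checkpointing and so on), which this sketch leaves out.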

Coming to the shared storage part:
  Currently there are 3 options.

   1) NFS filers (you may need to buy external devices).

   2) BookKeeper (a subproject of the open source ZooKeeper). It was mainly inspired by the NN. It is a high-performance write-ahead logging system, and it can also scale dynamically to more nodes depending on usage.
       The integration with BookKeeper is already available and we are running some clusters with it. See HDFS-3399.

   3) The other option is a quorum-based approach, which is under development. It mainly aims to develop shared storage nodes inside HDFS itself,
      so that it can make use of the proven RPC protocols for unified security mechanisms and use the proven edits storage layers. See HDFS-3077.


I hope this gives a better idea of the current state of HA in the community.

Regards,
Uma

________________________________________
From: Jan Van Besien [janvb@ngdata.com]
Sent: Thursday, August 16, 2012 1:41 PM
To: user@hadoop.apache.org
Subject: checkpointnode backupnode hdfs HA

I am a bit confused about the different options for namenode high
availability (or something along those lines) in CDH4 (hadoop-2.0.0).

I understand that the secondary namenode is deprecated, and that there
are two options to replace it: checkpoint or backup namenodes. Both are
well explained in the documentation, but the confusion begins when
reading about "HDFS High Availability", for example here:
http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html

Is the topic "HDFS High Availability" as described there (using shared
storage) related to checkpoint/backup nodes? If so, in what way?

If I read about backup nodes, it also seems to be aimed at high
availability. From what I understood, the current implementation doesn't
provide (warm) fail-over yet, but this is planned. So starting to
replace secondary namenodes now with backup namenodes sounds like a
future-proof idea?

thanks,
Jan