hadoop-hdfs-user mailing list archives

From surendra lilhore <surendra.lilh...@huawei.com>
Subject RE: Journal nodes , QJM requirement
Date Tue, 28 Feb 2017 07:16:17 GMT
Hi Amit,

1. Shared storage is used instead of writing directly to the standby so that the cluster stays
functional even when the standby is not available. The shared storage is distributed, so it keeps
working even if one of the nodes (the standby) fails. So it supports uninterrupted functionality for the

2. HDFS uses shared storage or JournalNodes to avoid the “split-brain” syndrome, where
multiple namenodes think they’re in charge of the cluster. The JournalNodes allow
only one active namenode at a time to write the edit logs.
For more info you can check the HDFS document https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
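
As a rough illustration of point 2, here is a toy Python sketch of how a quorum of journal nodes can fence off an old writer using epoch numbers. It is not Hadoop's actual code: the names JournalNode, Writer, new_epoch, journal, become_active and log_edit are made up for this example, the epoch numbers are passed in by hand, and segment recovery is left out entirely.

```python
class JournalNode:
    """Toy journal node (hypothetical): tracks the highest epoch it has promised."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []
        self.up = True

    def new_epoch(self, epoch):
        # Promise to honor this epoch only if it is newer than any seen so far.
        if not self.up or epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, txid, op):
        # Reject edits from a writer whose epoch is older than the promised one.
        if not self.up or epoch < self.promised_epoch:
            return False
        self.edits.append((txid, op))
        return True


class Writer:
    """Toy namenode-side writer (hypothetical): needs a majority of acks."""

    def __init__(self, name, journals):
        self.name = name
        self.journals = journals
        self.epoch = None

    def _majority(self, acks):
        return sum(acks) > len(self.journals) // 2

    def become_active(self, epoch):
        # Claim the writer role by getting a majority to promise the new epoch.
        ok = self._majority([jn.new_epoch(epoch) for jn in self.journals])
        if ok:
            self.epoch = epoch
        return ok

    def log_edit(self, txid, op):
        # An edit counts as committed only when a majority of nodes accept it.
        if self.epoch is None:
            return False
        acks = [jn.journal(self.epoch, txid, op) for jn in self.journals]
        return self._majority(acks)


journals = [JournalNode() for _ in range(3)]
nn1 = Writer("nn1", journals)
nn2 = Writer("nn2", journals)

nn1.become_active(1)
nn1.log_edit(1, "mkdir /a")             # accepted by all three nodes
journals[2].up = False                  # one journal node goes down
still_ok = nn1.log_edit(2, "mkdir /b")  # 2 of 3 acks is still a majority
nn2.become_active(2)                    # nn2 takes over with a newer epoch
fenced = nn1.log_edit(3, "mkdir /c")    # old writer is now rejected
```

In this sketch still_ok is True and fenced is False: losing one of three journal nodes does not stop writes, and once nn2 has claimed epoch 2 from a majority, nn1 can no longer get its edits accepted even if it still believes it is active.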


From: Amit Kabra [mailto:amitkabraiiit@gmail.com]
Sent: 27 February 2017 10:29
To: user@hadoop.apache.org<mailto:user@hadoop.apache.org>
Subject: Journal nodes , QJM requirement

Hi Hadoop Users,

I have one question that I couldn't find an answer to on the internet.

Why does Hadoop need a journaling system? To keep the Active and Standby NN in sync, instead of using
JournalNodes or any shared system, can't it do master-slave or multi-master replication, where
for any write the master also writes to the other master/slave and commits / accepts the write
only once replication is done at the other sites?

One reason I could think of is that the journal node writes data from the NN in append-only mode, which might
make it faster compared to writing to a slave / another master for replication, but I am not

Any pointers ?

Amit Kabra.