hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "LargeClusterTips" by ArpitAgarwal
Date Wed, 20 Jan 2016 06:15:24 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "LargeClusterTips" page has been changed by ArpitAgarwal:

Link to NameNode HA

   * Once you are on the private LAN, turn off all firewalls on the machines, as they only create
connectivity problems.
   * Use LDAP or similar to manage user accounts.
   * Only put the slaves file on your namenode and secondary namenode to prevent confusion.
   * Use RPMs to install the Hadoop binaries. [[Cloudera]] provide some RPMs for this, and
a web site to generate configuration RPM files.
   * Use kickstart or similar to bring up the machines. 
   * If you are trying to configure the machines one by one, step away from the keyboard.
That is not the way to manage a cluster.
@@ -34, +33 @@

  == NameNode Health ==
  The NameNode is a SPOF. When it goes offline, the cluster goes down. If it loses its data,
the filesystem is gone. Value it.
-  * Have a secondary name node! When the BackupNode replaces this, have a BackupNode!
+  * [[Configure NameNode High-Availability|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html]]
   * Never let its disks fill up.
   * Consider RAID storage here. If not, set it to save its data to two independent drives,
ideally on separate controllers (just in case the controller decides to play up)
   * Set the NN up to save one copy of all its data to a remote machine (NFS?), so even if
the NN goes down, you can bring up a new machine with the same hostname for everything else
to bind to.
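
As a rough illustration of the storage tips above (several independent metadata directories, one of them remote, and disks that never fill up), here is a minimal Python sketch. It assumes the Hadoop 2.x property dfs.namenode.name.dir, which accepts a comma-separated list of directories. The directory paths, the 20% free-space threshold, and the helper names are hypothetical and not part of the wiki page.

```python
import shutil
from pathlib import Path

# Hypothetical metadata directories: two local drives plus an NFS mount.
NAME_DIRS = ["/data/1/dfs/nn", "/data/2/dfs/nn", "/mnt/nfs/dfs/nn"]


def name_dir_property(dirs):
    """Return the comma-separated value for dfs.namenode.name.dir."""
    return ",".join(dirs)


def low_space_dirs(dirs, min_free_fraction=0.2, usage=shutil.disk_usage):
    """Return the directories whose filesystem is below the free-space threshold.

    The 20% default is an assumed threshold, not a Hadoop recommendation.
    `usage` is injectable so the check can be exercised without real disks.
    """
    low = []
    for d in dirs:
        total, _used, free = usage(d)
        if total == 0 or free / total < min_free_fraction:
            low.append(d)
    return low


if __name__ == "__main__":
    print("dfs.namenode.name.dir =", name_dir_property(NAME_DIRS))
    # Only check directories that exist on this machine.
    existing = [d for d in NAME_DIRS if Path(d).exists()]
    for d in low_space_dirs(existing):
        print("WARNING: low free space on", d)
```

Run from cron or a monitoring agent, a check like this could raise an alert well before a NameNode metadata volume fills up.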
