hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "LargeClusterTips" by SteveLoughran
Date Wed, 20 May 2009 13:59:35 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/LargeClusterTips

The comment on the change is:
More big cluster tips

------------------------------------------------------------------------------
  Below are tips for managing large clusters.
  
+  * Have a good sysadmin if you're not one yourself.
   * Take a look at a presentation done by Allen Wittenauer from Yahoo!: http://tinyurl.com/5foamm
+  * Have the LAN closed off to untrusted users. This simplifies security.
+  * Use LDAP or similar to manage user accounts.
-  * Only put the slaves file on your namenode and secondary namenode to prevent confusion
+  * Only put the slaves file on your namenode and secondary namenode to prevent confusion.
+  * Have identical hardware on all machines in the cluster, eliminating the need to have different
+    configuration options (task slots, data directory locations, etc)
+  * Use RPMs to install the Hadoop binaries. Self:Cloudera provide some RPMs for this, and a web site to generate configuration RPM files.
+  * Use kickstart or similar to bring up the machines. 
-  * Use a system configuration management package to keep Hadoop's source consistent across all nodes.  Some example packages are bcfg2, smartfrog, puppet, cfengine, etc.
+  * Consider a system configuration management package to keep Hadoop's source and configuration consistent across all nodes.  Some example packages are bcfg2, smartfrog, puppet, cfengine, etc.
-  * Have a good sysadmin if you're not one
+  * If you are trying to configure the machines one by one, step away from the keyboard. That is not the way to manage a cluster.
  
  See the Self:AmazonEC2 and AmazonS3 pages for tips on managing clusters built on EC2 and S3.
  
- Other good documentation: http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment
+ Other good documentation: [http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment Patterns of Hadoop Deployment]
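
The tip about keeping configuration consistent across all nodes can be illustrated with a small check. What follows is only a rough sketch, not part of the wiki page or of Hadoop itself: it assumes you have already collected each node's configuration file contents by some means (for example through your configuration management tool), and the function names and node names are hypothetical. It hashes each node's configuration and reports the nodes that differ from the most common version.

```python
import hashlib
from collections import defaultdict


def group_by_digest(configs):
    """Group node names by the SHA-256 digest of their configuration text.

    `configs` is a hypothetical mapping of node name -> configuration file
    contents, gathered however your site collects them.
    """
    groups = defaultdict(list)
    for node, text in sorted(configs.items()):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[digest].append(node)
    return dict(groups)


def find_drifted_nodes(configs):
    """Return the sorted names of nodes whose configuration differs from the
    most common configuration in the cluster."""
    groups = group_by_digest(configs)
    if len(groups) <= 1:
        # Zero or one distinct configuration: nothing has drifted.
        return []
    majority = max(groups.values(), key=len)
    return sorted(
        node
        for nodes in groups.values()
        if nodes is not majority
        for node in nodes
    )


if __name__ == "__main__":
    # Made-up sample data: node3 was edited by hand and no longer matches.
    sample = {
        "node1": "mapred.tasktracker.map.tasks.maximum=4\n",
        "node2": "mapred.tasktracker.map.tasks.maximum=4\n",
        "node3": "mapred.tasktracker.map.tasks.maximum=8\n",
    }
    print(find_drifted_nodes(sample))
```

A check like this only detects drift. Preventing it is still the job of the packaging and configuration management tools listed in the tips above.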
  
