hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HadoopIsNot" by SteveLoughran
Date Wed, 25 Nov 2009 17:09:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HadoopIsNot" page has been changed by SteveLoughran.
The comment on this change is: More details on what you need to know before you get started.
http://wiki.apache.org/hadoop/HadoopIsNot?action=diff&rev1=4&rev2=5

--------------------------------------------------

  
  == Hadoop clusters are not a place to learn Unix/Linux system administration ==
  
- You need to know your way round a Unix/Linux system. How to install it, what the various
files in /etc/ are for, how to set up networking, what is a good hosts table, debug DNS problems,
why to keep logs on a separate disk from the root disk, etc. If you cannot look after a single
machine, you aren't going to be able to handle a cluster of 80 of them. That said, don't try
maintaining those 80+ boxes using the same technique of hand-editing files lile [[/etc/hosts]],
because it doesn't scale.
+ You need to know your way round a Unix/Linux system. How to install it, what the various
files in /etc/ are for, how to set up networking, what is a good hosts table, how to debug
DNS problems, why to keep logs on a separate disk from the root disk, etc. If you cannot look
after a single machine, you aren't going to be able to handle a cluster of 80 of them. That
said, don't try maintaining those 80+ boxes using the same technique of hand-editing files
like [[/etc/hosts]], because it doesn't scale.
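One way to avoid hand-editing a hosts table on every box is to generate it from a single inventory and push the result out. The following is a minimal sketch of that idea, not something the wiki page prescribes: the `render_hosts` function, the `node01.example.internal` style names and the `10.0.0.x` addresses are all hypothetical, chosen only for illustration.

```python
import ipaddress


def render_hosts(inventory):
    """Render /etc/hosts-style text from a {hostname: ip} mapping.

    The mapping is a hypothetical single source of truth for the cluster.
    """
    lines = ["127.0.0.1 localhost"]
    for name, ip in sorted(inventory.items()):
        # ip_address() raises ValueError on a malformed address, so a typo
        # in the inventory is caught before it reaches 80 machines.
        ipaddress.ip_address(ip)
        lines.append(f"{ip} {name}")
    return "\n".join(lines) + "\n"


# Hypothetical 80-node inventory: node01 .. node80 on 10.0.0.1 .. 10.0.0.80.
inventory = {
    f"node{i:02d}.example.internal": f"10.0.0.{i}" for i in range(1, 81)
}
hosts_text = render_hosts(inventory)
```

The generated text would then be distributed by whatever configuration management tool the site already uses; in many clusters a properly run DNS service replaces the hosts table entirely.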
+ 
+ Things you need to know
+ 
+  * SSH, what it is, how to set up authorized_keys, how to use ssh and scp
+  * ifconfig, nslookup and other network config/diagnostics tools
+  * How your platform keeps itself up to date
+  * What the various log files your machine generates are, and what they mean
+  * How to set up native filesystems and mount them
+ 
+ This is important. If you don't know these, you are out of your depth and should not start
installing Hadoop until you have the basics of a couple of Linux systems up and running: you can
ssh in to each of them without entering a password, they know each other's hostnames, and so
on. The Hadoop installation documents all assume you can do these things, and aren't going
to bother explaining them.
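The two prerequisites named above (hostnames that resolve, and ssh logins that need no password) can be checked mechanically before starting an install. This is a rough sketch under stated assumptions: the function names are hypothetical, the host names passed in are whatever your machines are called, and it assumes the OpenSSH `ssh` client is on the PATH. `BatchMode=yes` is a standard OpenSSH option that makes ssh fail instead of prompting for a password.

```python
import socket
import subprocess


def resolves(hostname):
    """Return True if this machine can resolve hostname to an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError.
        return False


def ssh_command(hostname):
    """Build an ssh invocation that never prompts for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail rather than ask for a password
        "-o", "ConnectTimeout=5",   # do not hang on an unreachable host
        hostname,
        "true",                     # trivial remote command
    ]


def passwordless_ssh_works(hostname):
    """Return True if key-based login to hostname succeeds."""
    try:
        result = subprocess.run(
            ssh_command(hostname), capture_output=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check(hosts):
    """Report DNS and ssh status for each host in an iterable of names."""
    return {
        h: {"dns": resolves(h), "ssh": passwordless_ssh_works(h)}
        for h in hosts
    }
```

Calling `check(["node01", "node02"])` (hypothetical names) from each machine in turn shows whether every box can reach every other one; any `False` entry points at something to fix before touching Hadoop.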
  
  == Hadoop Filesystem is not a substitute for a High Availability SAN-hosted FS ==
  
