hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "GettingStartedWithHadoop" by mahadevkonar
Date Thu, 24 Aug 2006 02:24:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by mahadevkonar:

New page:
= Setting Up a Cluster using Hadoop scripts =
This section explains how to set up a Hadoop cluster running Hadoop DFS and Hadoop MapReduce.
The startup scripts are in hadoop/bin. The slaves file in hadoop/conf lists all the slave nodes
that join the DFS and MapReduce cluster; edit it to add nodes to your cluster. You need to edit
the slaves file only on the machines on which you plan to run the Jobtracker and Namenode. Next,
edit hadoop-env.sh in the hadoop/conf directory and make sure JAVA_HOME is set correctly. You can
change the other environment variables to suit your requirements. HADOOP_HOME is determined
automatically from where you run the Hadoop scripts.
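
As a rough illustration of the two files described above, the sketch below writes a sample slaves file (one hostname per line) and a JAVA_HOME line for hadoop-env.sh. The hostnames and the JDK path are hypothetical placeholders, and the files go into a scratch directory so that no real configuration is touched.

```shell
# Sketch only: writes into a scratch directory, not into hadoop/conf.
# Hostnames and the JDK path below are hypothetical placeholders.
conf_dir="$(mktemp -d)"

# slaves: one slave hostname per line.
cat > "$conf_dir/slaves" <<'EOF'
slave01.example.com
slave02.example.com
slave03.example.com
EOF

# hadoop-env.sh: the line that sets JAVA_HOME.
echo 'export JAVA_HOME=/usr/lib/jvm/java' > "$conf_dir/hadoop-env.sh"

# Show the result.
cat "$conf_dir/slaves"
cat "$conf_dir/hadoop-env.sh"
```

In a real installation the same content would go into hadoop/conf/slaves and hadoop/conf/hadoop-env.sh, with your own hostnames and Java location.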

== Starting up DFS ==
=== Formatting the Namenode ===
 * You must format the Namenode for your first installation, and only for your first
installation. Do not format a Namenode that has already been running Hadoop: formatting
erases your DFS. Run bin/hadoop namenode -format on the node you plan to run as the Namenode.
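
Because formatting wipes an existing DFS, one cautious option is to wrap the format command in a guard. The sketch below is not part of Hadoop: the function name format_namenode and the FIRST_INSTALL variable are hypothetical, and the function assumes it is run from the Hadoop installation directory on the Namenode machine.

```shell
# Hypothetical guard around the format command; not a Hadoop feature.
# It refuses to run unless FIRST_INSTALL=yes is set explicitly.
format_namenode() {
  if [ "${FIRST_INSTALL:-no}" != "yes" ]; then
    echo "refusing to format: set FIRST_INSTALL=yes for a first installation" >&2
    return 1
  fi
  # Assumes the current directory is the Hadoop installation directory.
  bin/hadoop namenode -format
}
```

On a first installation this would be invoked as FIRST_INSTALL=yes format_namenode; on any later run, the guard stops the command before it reaches Hadoop.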

=== Environment Variables ===
 * The only environment variable that you may need to set is HADOOP_CONF_DIR. Set it to
your configuration directory, which contains hadoop-site.xml and hadoop-env.sh.
 * You can avoid setting this variable by passing the configuration directory with the
--config option instead.
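
A minimal sketch of the first alternative, assuming a hypothetical configuration directory at /opt/hadoop/conf; the second alternative appears as a comment because it needs a Hadoop installation to run.

```shell
# Alternative 1: set the variable once for the current shell session.
# /opt/hadoop/conf is a hypothetical path; use your own configuration directory.
export HADOOP_CONF_DIR=/opt/hadoop/conf
echo "Using configuration directory: $HADOOP_CONF_DIR"

# Alternative 2 (not executed here): leave the variable unset and pass the
# directory on the command line, for example:
#   bin/start-mapred.sh --config /opt/hadoop/conf
```

Setting the variable is convenient when several scripts are run in a row; the --config form keeps the choice explicit on each command.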
=== Starting up the cluster ===
 * After formatting the Namenode, run bin/start-dfs.sh on the Namenode. This brings up
the Namenode and the Datanodes on the machines listed in the slaves file mentioned above.
 * Run bin/start-mapred.sh on the machine on which you plan to run the Jobtracker. This brings
up the MapReduce cluster, with the Jobtracker running on the machine where you ran the command
and Tasktrackers running on the machines listed in the slaves file.
 * If you have not set the HADOOP_CONF_DIR variable, you can use bin/start-mapred.sh
--config configure_directory.
 * To check that the cluster is working, run bin/hadoop dfs -lsr /.
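
The startup order above can be sketched as a small script. It assumes, for simplicity, that the Namenode and Jobtracker run on the same machine and that the script is started from the Hadoop installation directory. The run helper and the DRY_RUN variable are hypothetical additions: with DRY_RUN left at its default of 1 the commands are only printed.

```shell
# Hypothetical helper: prints the command when DRY_RUN=1 (the default here),
# and executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

start_cluster() {
  run bin/start-dfs.sh        # Namenode plus Datanodes from the slaves file
  run bin/start-mapred.sh     # Jobtracker plus Tasktrackers from the slaves file
  run bin/hadoop dfs -lsr /   # quick check that DFS answers
}

start_cluster
```

Running the script with DRY_RUN=0 would execute the three commands for real. If the Namenode and Jobtracker are on different machines, each start script has to be run on its own machine instead.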

=== Stopping the cluster ===
 * You can stop the cluster by running bin/stop-mapred.sh and then bin/stop-dfs.sh. You can
specify the configuration directory with the --config option.
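
For illustration, the shutdown order can be generated by a small function that prints the two commands, optionally with a configuration directory. The function name stop_cluster_commands and the path /opt/hadoop/conf are hypothetical; nothing is executed against a cluster.

```shell
# Hypothetical helper: prints the stop commands in the order given above.
# An optional first argument is passed along as the --config directory.
stop_cluster_commands() {
  conf_dir="${1:-}"
  for script in bin/stop-mapred.sh bin/stop-dfs.sh; do
    if [ -n "$conf_dir" ]; then
      echo "$script --config $conf_dir"
    else
      echo "$script"
    fi
  done
}

stop_cluster_commands /opt/hadoop/conf
```

Called without an argument, the function prints the plain commands, which rely on HADOOP_CONF_DIR or the default configuration directory.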
