hadoop-user mailing list archives

From "Kartashov, Andy" <Andy.Kartas...@mpac.ca>
Subject cluster set-up / directory structure or something that I was originally confused about
Date Tue, 06 Nov 2012 15:22:38 GMT
Hadoopers,

Last month I asked the two questions quoted below, and this is what I have since learned.

If you decide to override Hadoop's default directories (recommended), here is what to keep in mind.

You create directories in two places:

A. On the local Linux FS, using $ mkdir:
***************************

1. What:
/home/hadoop/dfs/name (required if you are running the NameNode on this node)
/home/hadoop/dfs/data (required if you are running a DataNode)
/home/hadoop/dfs/namesecondary (required if you are running the SNN)

Where to specify the properties:
conf/hdfs-site.xml

NOTE: if you start the NN/DN/SNN before creating those directories and modifying hdfs-site.xml,
a /tmp/..-hdfs/dfs/[name|data|namesecondary] directory will be created for you on the local
Linux file system.
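As a sketch, the matching hdfs-site.xml entries for those three local paths could look like this (property names are the Hadoop 1.x/MRv1 ones; fs.checkpoint.dir is the SNN's directory):

```xml
<!-- conf/hdfs-site.xml : sketch for the three local directories above -->
<configuration>
  <!-- NameNode metadata (local FS path, create it with mkdir first) -->
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/dfs/name</value>
  </property>
  <!-- DataNode block storage -->
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/dfs/data</value>
  </property>
  <!-- SecondaryNameNode checkpoint directory -->
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/home/hadoop/dfs/namesecondary</value>
  </property>
</configuration>
```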

2. What:
/home/mapred/local (required if you are running a TT on this node)

Where to specify the property:
conf/mapred-site.xml
NOTE: if you don't create this directory and modify mapred-site.xml, a /tmp/*/mapred
/local directory will be created for you on the local Linux file system when you start the TT daemon.
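A minimal sketch of the corresponding mapred-site.xml entry for the TT's local scratch space:

```xml
<!-- conf/mapred-site.xml : local FS scratch space for the TaskTracker -->
<property>
  <name>mapred.local.dir</name>
  <value>/home/mapred/local</value>
</property>
```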

B. On HDFS, using $ sudo -u hdfs hadoop fs -mkdir
*************************************
1. What:
/tmp
/var/...... (depending on your MapReduce)
/user/<user>
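A sketch of the commands, run as the hdfs superuser (I've left /var/... out since its layout depends on your MapReduce setup; <user> stays a placeholder for your actual user):

```shell
# Sketch: create the HDFS-side directories listed above.
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod 1777 /tmp        # world-writable, sticky bit
sudo -u hdfs hadoop fs -mkdir /user/<user>
sudo -u hdfs hadoop fs -chown <user> /user/<user>
```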

2. What:
/home/hadoop/system (required for the JT)
Where to specify the property:
conf/mapred-site.xml
NOTE: if you start the JobTracker daemon on your NN before creating the above directory, a default
directory - /tmp/hadoop/mapred/system - will be created for you inside HDFS.
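The corresponding mapred-site.xml entry might look like this (note the path is interpreted as an HDFS path, not a local one):

```xml
<!-- conf/mapred-site.xml : HDFS path where the JobTracker keeps system files -->
<property>
  <name>mapred.system.dir</name>
  <value>/home/hadoop/system</value>
</property>
```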

Each directory requires proper ownership: DFS directories need hdfs:hadoop, MapReduce directories need mapred:hadoop.
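A sketch of the ownership commands, assuming the directory names used above:

```shell
# Local FS directories:
sudo chown -R hdfs:hadoop /home/hadoop/dfs
sudo chown -R mapred:hadoop /home/mapred/local
# HDFS directory used by the JobTracker:
sudo -u hdfs hadoop fs -chown -R mapred:hadoop /home/hadoop/system
```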

Feel free to correct me if I am wrong.

Rds,
AK47


-----Original Message-----
From: Kartashov, Andy
Sent: Friday, October 26, 2012 12:40 PM
To: user@hadoop.apache.org
Subject: cluster set-up / a few quick questions

Gents,

1.
- do you put the master node's <hostname> under fs.default.name in core-site.xml on the
slave machines, or the slaves' own hostnames?
- do you need to run "sudo -u hdfs hadoop namenode -format" and create the /tmp and /var folders on
the HDFS of the slave machines that will be running only a DN and TT, or not? Do you still need
to create the hadoop/dfs/name folder on the slaves?
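(For reference: fs.default.name names the NameNode's URI and is the same on every node, master and slaves alike. A minimal core-site.xml sketch, with the master hostname and port assumed:)

```xml
<!-- conf/core-site.xml : identical on every node; points at the NameNode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master-host:8020</value>
</property>
```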

2.
In hdfs-site.xml, for the dfs.name.dir and dfs.data.dir properties we specify /hadoop/dfs/name
and /hadoop/dfs/data as local Linux FS directories, created by running "mkdir -p /hadoop/dfs/data",
but the mapred.system.dir property is supposed to point to HDFS, not the local FS, since we run "sudo
-u hdfs hadoop fs -mkdir /tmp/mapred/system"??
If so, and since the format - /foo/bar/baz - is exactly the same, how does Hadoop know which directory
is on the local FS and which is on HDFS?


Would you please kindly reconfirm.

Cheers,
AK47
NOTICE: This e-mail message and any attachments are confidential, subject to copyright and
may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not
the intended recipient, please delete and contact the sender immediately. Please consider
the environment before printing this e-mail.