hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andy Isaacson <...@cloudera.com>
Subject Re: cluster set-up / a few quick questions
Date Fri, 26 Oct 2012 18:20:58 GMT
On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <Andy.Kartashov@mpac.ca> wrote:
> Gents,

We're not all male here. :)  I prefer "Hadoopers" or "hi all,".

> 1.
> - do you put Master's node <hostname> under fs.default.name in core-site.xml on
the slave machines or slaves' hostnames?

Master.  I have a 4-node cluster, named foo1 - foo4. My
fs.default.name is hdfs://foo1.domain.com.

> - do you need to run "sudo -u hdfs hadoop namenode -format" and create /tmp /var folders
on the HDFS of the slave machines that will be running only DN and TT or not? Do you still
need to create hadoop/dfs/name folder on the slaves?

(The following is the simple answer, for non-HA non-federated HDFS.
You'll want to get the simple example working before trying the
complicated ones.)

No. A cluster has one namenode, running on the machine known as the
master, and the admin must "hadoop namenode -format" on that machine

In my example, I ran "hadoop namenode -format" on foo1.

> 2.
> In hdfs-site.xml for dfs.name.dir & dfs.data.dir properties  we specify  /hadoop/dfs/name
/hadoop/dfs/data  being  local linux NFS directories by running command "mkdir -p /hadoop/dfs/data"
> but mapred.system.dir  property is to point to HDFS and not NFS  since we are running
"sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
> If so and since it is exactly the same format  /far/boo/baz how does hadoop know which
directory is local on NFS or HDFS?

This is very confusing, to be sure!  There are a few places where
paths are implicitly known to be on HDFS rather than a Linux
filesystem path. mapred.system.dir is one of those. This does mean
that given a string that starts with "/tmp/" you can't necessarily
know whether it's a Linux path or a HDFS path without looking at the
larger context.

In the case of mapred.system.dir, the docs are the place to check;
according to cluster_setup.html, mapred.system.dir is "Path on the
HDFS where where the Map/Reduce framework stores system files".


Hope this helps,

View raw message