hadoop-common-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject Re: mapred.system.dir
Date Sat, 13 Feb 2010 01:14:29 GMT
To expand on Eric's comment: dfs.data.dir is the local filesystem directory
(or directories) that a particular datanode uses to store its slice of the
HDFS data blocks.

So dfs.data.dir might be "/home/hadoop/data/" on some machine; a bunch of
files with inscrutable names like blk_4546857325993894516 will be stored
there. These "blk" files represent chunks of "real" complete user-accessible
files in HDFS-proper.

mapred.system.dir is a filesystem path like "/system/mapred" that is served
by HDFS, where files used by MapReduce appear. The purpose of the config
file comment is to let you know that you're free to pick a path name like
"/system/mapred" here even though your local Linux machine doesn't have a
path named "/system"; this HDFS path is in a separate (HDFS-specific)
namespace from "/home", "/etc", "/var" and the other various denizens of
your local machine.
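
As a rough sketch of how the two settings might look side by side (the
paths are just the illustrative ones from above, and I'm assuming the
0.20-style split into hdfs-site.xml and mapred-site.xml; adjust for your
version):

```xml
<!-- hdfs-site.xml: a LOCAL filesystem path on each datanode
     (example path only) -->
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/data</value>
</property>

<!-- mapred-site.xml: a path inside the HDFS namespace, not on local disk
     (example path only) -->
<property>
  <name>mapred.system.dir</name>
  <value>/system/mapred</value>
</property>
```

Each property goes inside the <configuration> element of its file. The
first value must name a directory on the datanode's local disk; the second
only needs to make sense as a path within HDFS.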

- Aaron

On Fri, Feb 12, 2010 at 6:23 AM, Eric Sammer <eric@lifeless.net> wrote:

> On 2/12/10 8:40 AM, Edson Ramiro wrote:
> > Hi all,
> >
> > I'm setting up a Hadoop cluster and some questions have
> > arisen about Hadoop configuration.
> >
> > The Hadoop Cluster Setup [1] says that the mapred.system.dir must
> > be in the HDFS and be accessible from both the server and clients.
> >
> > Where is the HDFS directory? Is it the dfs.data.dir?
> >
> > Should I export the mapred.system.dir over NFS or another protocol to
> > make it accessible from server and clients?
> >
> > Thanks in advance
> >
> > [1] http://hadoop.apache.org/common/docs/current/cluster_setup.html
> >
> > Edson Ramiro
> >
> Edson:
> An HDFS file system is a distributed global view controlled by the
> namenode. If a file is "in HDFS" all clients and servers that are
> pointed at the namenode will be able to see everything. This means that
> you don't need to do anything special to export or reveal the
> mapred.system.dir; that's what HDFS does. It's worth reading the HDFS
> Architecture paper on the Hadoop site or the Google GFS paper for
> details on how this all works and how it relates to MapReduce.
> HTH.
> --
> Eric Sammer
> eric@lifeless.net
> http://esammer.blogspot.com
