hadoop-general mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: Hadoop Cluster Configuration
Date Sun, 01 Aug 2010 18:39:18 GMT

The configuration for Hadoop is such that the same configuration files can
be distributed to all nodes. Normally, users configure Hadoop appropriately for
their environment and then simply distribute the same configs to all nodes
in the cluster. This is usually much easier than worrying about which config
parameters apply to which daemons.
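
As a rough sketch of what "distribute the same configs" can look like in
practice, the Python below builds one rsync command per host. The helper
names, the /opt/hadoop/conf path, and the host names are assumptions made up
for illustration; the host list is assumed to follow the conf/slaves layout
(one host per line). It only prints the commands and does not run them.

```python
import shlex


def read_hosts(text):
    """Return host names from a slaves-style file: one per line, '#' starts a comment."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def build_sync_commands(conf_dir, hosts):
    """Build one rsync argv per host, copying conf_dir to the same path remotely."""
    src = conf_dir.rstrip("/") + "/"  # trailing slash: copy the directory contents
    return [["rsync", "-az", src, f"{host}:{src}"] for host in hosts]


if __name__ == "__main__":
    # Hypothetical host list and install path, for illustration only.
    hosts = read_hosts("node1\nnode2\n")
    for cmd in build_sync_commands("/opt/hadoop/conf", hosts):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command construction separate from execution makes it easy to
review what would be copied before touching any node.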

That said, it is possible to have different configs for different machines,
if necessary. There's currently no documentation that says which parameters
are read by which daemons although it's (usually) possible to figure that
out from the naming convention of the parameters. Parameters in
core-site.xml apply to all daemons, those in hdfs-site.xml to the HDFS
daemons, and those in mapred-site.xml to the MapReduce daemons. As for which
parameters are used by which daemons, mapred.tasktracker.* is used by the
task tracker, for instance. Some are harder to figure out just by their
names. Again, I would recommend distributing the same configs everywhere to
make things easier.
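
To illustrate the naming convention, here is a small Python sketch that
guesses the config file and the daemon from a parameter name. This is a
name-based heuristic only, not something Hadoop provides; the function names
and the hint table are assumptions, and some parameters will not fit it (a
result of None means the name alone does not say).

```python
# Substring hints checked in order; purely a naming heuristic.
DAEMON_HINTS = [
    ("tasktracker", "TaskTracker"),
    ("task.tracker", "TaskTracker"),
    ("jobtracker", "JobTracker"),
    ("job.tracker", "JobTracker"),
    ("datanode", "DataNode"),
    ("namenode", "NameNode"),
]


def guess_config_file(param):
    """Guess the config file from the parameter prefix."""
    if param.startswith("dfs."):
        return "hdfs-site.xml"
    if param.startswith("mapred."):
        return "mapred-site.xml"
    # Anything else is assumed to be a common setting.
    return "core-site.xml"


def guess_daemon(param):
    """Guess the daemon from the parameter name, or None if the name gives no hint."""
    name = param.lower()
    for hint, daemon in DAEMON_HINTS:
        if hint in name:
            return daemon
    return None


if __name__ == "__main__":
    for p in ("mapred.tasktracker.map.tasks.maximum", "dfs.data.dir", "fs.default.name"):
        print(p, guess_config_file(p), guess_daemon(p))
```

A name like dfs.data.dir shows the limits of this approach: the prefix points
to hdfs-site.xml, but nothing in the name says which daemon reads it.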

On Wed, Jul 28, 2010 at 4:44 AM, vaibhav negi <sssssssenator@gmail.com> wrote:

> Hi,
> I am using Hadoop version 0.20 and am trying to set up a Hadoop cluster.
> What are the configurations for the name node? What are the configurations
> for the data nodes?
> The documentation lists all the configuration parameters, but it does not
> say which ones need to be configured for which node.
> Vaibhav Negi
> On Wed, Jul 28, 2010 at 1:32 PM, Hemanth Yamijala <yhemanth@gmail.com> wrote:
> > Vaibhav,
> >
> > > While setting up a Hadoop cluster, do the configuration files
> > > (conf/core-site.xml, conf/mapred-site.xml, conf/hdfs-site.xml) on
> > > every node (name node and data nodes) need to be configured in the
> > > same manner?
> >
> > This is a complicated question to answer :-). There are certain
> > configuration variables that need to be defined to be the same between
> > the master and the slaves and some that don't need to be. Before Hadoop
> > 0.21, there is no easy way, other than (hopefully) the documentation for
> > the variables, to determine whether this is the case. I think in Hadoop
> > 0.21 and later, we have tried to adopt a convention of including the
> > daemon name to indicate which variables are used by which daemons. And
> > those that are cluster-wide, which need to be the same on all the
> > nodes, will have something like 'cluster' in the name.
> >
> > Your best bet in any case is possibly to sift through the
> > documentation of the variables you are interested in. Or else to post
> > a query here.
> >
> > > How does the configuration of the name node differ from the
> > > configuration of the data nodes?
> >
> > Not sure about this one.
> >
> > Thanks
> > hemanth
> >

Eric Sammer
twitter: esammer
data: www.cloudera.com
