hadoop-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: Can hadoop.tmp.dir be multivalued?
Date Tue, 18 Dec 2012 19:41:05 GMT
Hi Harsh,

Sorry, I forgot to mention that I am using CDH 4.1 with MRv1. I got the
mapreduce.cluster.temp.dir property from
http://hadoop.apache.org/docs/mapreduce/current/mapred-default.html. Is that
an incorrect source?
Thanks for the prompt reply.

~Anil

On Tue, Dec 18, 2012 at 11:13 AM, Harsh J <harsh@cloudera.com> wrote:

> The purpose of the hadoop.tmp.dir is as its name says - for actual,
> temporary data. For a more out-of-box experience, such that users have
> little trouble configuring to get started, we use it as a base
> property for several actual required properties. This is not suitable
> for production of course - and is only done for OOB experience.
>
> If you wish to grant your TaskTracker or NodeManager several disks to
> parallelize IO upon, use/override their respective local directory
> configurations - and quit leveraging the out-of-box hadoop.tmp.dir
> default.
>
> Also, which version of Hadoop are you asking about? The property
> mapreduce.cluster.temp.dir does not exist in 1.x and is irrelevant in
> 2.x. It seems to be a legacy property that is no longer used.
>
> On Wed, Dec 19, 2012 at 12:15 AM, anil gupta <anilgupta84@gmail.com>
> wrote:
> > Hi All,
> >
> > On my worker nodes I have 10 drives. To balance disk I/O, I want to
> > distribute the disk read/write load evenly. "hadoop.tmp.dir" is used
> > for a lot of things in MR:
> >
> > mapreduce.cluster.local.dir
> >   Default: ${hadoop.tmp.dir}/mapred/local
> >   The local directory where MapReduce stores intermediate data files.
> >   May be a comma-separated list of directories on different devices in
> >   order to spread disk i/o. Directories that do not exist are ignored.
> >
> > mapreduce.jobtracker.system.dir
> >   Default: ${hadoop.tmp.dir}/mapred/system
> >   The directory where MapReduce stores control files.
> >
> > mapreduce.jobtracker.staging.root.dir
> >   Default: ${hadoop.tmp.dir}/mapred/staging
> >   The root of the staging area for users' job files. In practice, this
> >   should be the directory where users' home directories are located
> >   (usually /user).
> >
> > mapreduce.cluster.temp.dir
> >   Default: ${hadoop.tmp.dir}/mapred/temp
> >   A shared directory for temporary files.
> >
> > I am aware that mapreduce.cluster.local.dir can be multivalued and that
> > I can explicitly set this property, but I was wondering whether it would
> > be even better if I could set multiple values in the hadoop.tmp.dir
> > property. Also, is the mapreduce.cluster.temp.dir property multivalued
> > or single-valued?
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>
>
>
> --
> Harsh J
>



-- 
Thanks & Regards,
Anil Gupta
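
The advice quoted in this thread (override the MapReduce local directory
property with one directory per disk, rather than relying on the
hadoop.tmp.dir default) could be sketched roughly as below. This is only an
illustration: the /data/1 ... /data/10 mount points and the helper names are
hypothetical, and the property name is the one quoted in the thread. On MRv1
the equivalent property is, to my knowledge, mapred.local.dir, so check the
mapred-default.xml that ships with your release before using either name.

```python
from xml.sax.saxutils import escape


def build_local_dirs(mount_points, subdir="mapred/local"):
    """Return one directory per disk, joined into a comma-separated value."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def property_xml(name, value):
    """Render a single Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


# Hypothetical mount points for the 10 drives mentioned in the thread.
mounts = [f"/data/{i}" for i in range(1, 11)]
value = build_local_dirs(mounts)

# Property name as quoted in the thread; assumed, not verified for CDH 4.1 MRv1.
print(property_xml("mapreduce.cluster.local.dir", value))
```

The printed fragment would go inside the <configuration> element of
mapred-site.xml on each worker node. Generating the value from a list of
mount points keeps the ten entries consistent and avoids typos in a long
hand-written comma-separated string.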
