hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Can hadoop.tmp.dir be multivalued?
Date Tue, 18 Dec 2012 19:13:03 GMT
The purpose of hadoop.tmp.dir is what its name says: it holds actual
temporary data. To give new users a smoother out-of-box experience, with
little configuration needed to get started, we also use it as the base
for the defaults of several properties that are actually required. That
is not suitable for production, of course; it is done only for the
out-of-box experience.

If you wish to give your TaskTracker or NodeManager several disks to
parallelize I/O across, override their respective local directory
configurations and stop relying on the out-of-box hadoop.tmp.dir default.
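
As a rough sketch of what such an override could look like: the fragment
below assumes the data disks are mounted at /data/1, /data/2 and /data/3
(hypothetical mount points; list one directory per physical disk, so ten
entries on a ten-disk node). mapred.local.dir is the 1.x TaskTracker
property and yarn.nodemanager.local-dirs is the 2.x NodeManager property.
Set only the one that matches your version.

```xml
<!-- mapred-site.xml (Hadoop 1.x TaskTracker).
     The /data/N mount points are assumptions for illustration. -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>

<!-- yarn-site.xml (Hadoop 2.x NodeManager).
     Same hypothetical mount points as above. -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>
</property>
```

With an explicit comma-separated list like this, the local directories no
longer derive from hadoop.tmp.dir, which can stay a single path.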

Also, which version of Hadoop is your question about? The property
mapreduce.cluster.temp.dir does not exist in 1.x and is irrelevant in
2.x. It seems to be a legacy property that is no longer used.

On Wed, Dec 19, 2012 at 12:15 AM, anil gupta <anilgupta84@gmail.com> wrote:
> Hi All,
> On my worker nodes I have 10 drives. To balance disk I/O, I want to
> distribute the disk read/write load evenly across them. "hadoop.tmp.dir"
> is used for a lot of things in MR:
>
> mapreduce.cluster.local.dir
>   Default: ${hadoop.tmp.dir}/mapred/local
>   The local directory where MapReduce stores intermediate data files.
>   May be a comma-separated list of directories on different devices in
>   order to spread disk i/o. Directories that do not exist are ignored.
>
> mapreduce.jobtracker.system.dir
>   Default: ${hadoop.tmp.dir}/mapred/system
>   The directory where MapReduce stores control files.
>
> mapreduce.jobtracker.staging.root.dir
>   Default: ${hadoop.tmp.dir}/mapred/staging
>   The root of the staging area for users' job files. In practice, this
>   should be the directory where users' home directories are located
>   (usually /user).
>
> mapreduce.cluster.temp.dir
>   Default: ${hadoop.tmp.dir}/mapred/temp
>   A shared directory for temporary files.
>
> I am aware that mapreduce.cluster.local.dir can be multivalued and that I
> can set it explicitly, but I was wondering whether it would be even better
> to set multiple values in the hadoop.tmp.dir property itself. Also, is the
> mapreduce.cluster.temp.dir property multivalued or single-valued?
> --
> Thanks & Regards,
> Anil Gupta

Harsh J
