hadoop-common-user mailing list archives

From zsongbo <zson...@gmail.com>
Subject Re: best practice: mapred.local vs dfs drives
Date Sun, 05 Apr 2009 07:52:30 GMT
I usually set mapred.local.dir to share the disk space with DFS, since some
MapReduce jobs need a lot of temporary space.
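
As a rough sketch of what that sharing can look like, assuming Hadoop
0.19/0.20-era property names and two DFS drives mounted at /data1 and /data2
(both mount points and the 10GB figure are illustrative, not taken from any
real cluster), the relevant entries in hadoop-site.xml might be:

```xml
<!-- Sketch only: /data1 and /data2 are hypothetical mount points. -->
<configuration>
  <!-- DataNode block storage, one directory per DFS drive. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data1/dfs/data,/data2/dfs/data</value>
  </property>

  <!-- TaskTracker temporary files (map output, spills) on the same drives. -->
  <property>
    <name>mapred.local.dir</name>
    <value>/data1/mapred/local,/data2/mapred/local</value>
  </property>

  <!-- Bytes per volume that DFS leaves free for non-DFS use.
       10737418240 = 10GB, an illustrative value only. -->
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>10737418240</value>
  </property>
</configuration>
```

Note that dfs.datanode.du.reserved only stops DFS from filling the last part
of each volume, so some room always remains for job temp files. It does not
cap how much space a job can take, so a run-away job can still fill the
shared disks.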



On Fri, Apr 3, 2009 at 8:36 PM, Craig Macdonald <craigm@dcs.gla.ac.uk> wrote:

> Hello all,
>
> Following recent hardware discussions, I thought I'd ask a related
> question. Our cluster nodes have 3 drives: 1x 160GB system/scratch and 2x
> 500GB DFS drives.
>
> The 160GB system drive is partitioned so that 100GB is set aside for
> mapred.local space. However, we find that for our application, the free
> mapred.local space available for map output is the limiting factor on the
> number of reducers we can have (our application prefers fewer reducers).
>
> How do people normally handle DFS vs mapred.local space? Do you (a) share
> the DFS drives with the task tracker temporary files, or do you (b) keep
> them on separate partitions or drives?
>
> We originally went with (b) because it prevented a run-away job from eating
> all the DFS space on the machine. However, I'm beginning to realise the
> disadvantages.
>
> Any comments?
>
> Thanks
>
> Craig
>
>
