hadoop-common-user mailing list archives

From Owen O'Malley <owen.omal...@gmail.com>
Subject Re: best practice: mapred.local vs dfs drives
Date Sun, 05 Apr 2009 15:33:24 GMT
We always share the drives.

-- Owen

On Apr 5, 2009, at 0:52, zsongbo <zsongbo@gmail.com> wrote:

> I usually set mapred.local.dir to share the disk space with DFS, since some
> MapReduce jobs need a lot of temporary space.
>
>
>
> On Fri, Apr 3, 2009 at 8:36 PM, Craig Macdonald <craigm@dcs.gla.ac.uk> wrote:
>
>> Hello all,
>>
>> Following recent hardware discussions, I thought I'd ask a related
>> question. Our cluster nodes have 3 drives: 1x 160GB system/scratch and
>> 2x 500GB DFS drives.
>>
>> The 160GB system drive is partitioned so that 100GB is reserved for
>> mapred.local space. However, we find that for our application, the free
>> mapred.local space available for map output is the limiting factor on the
>> number of reducers we can have (our application prefers fewer reducers).
>>
>> How do people normally split space between DFS and mapred.local? Do you
>> (a) share the DFS drives with the task tracker temporary files, or (b) keep
>> them on separate partitions or drives?
>>
>> We originally went with (b) because it prevented a runaway job from eating
>> all the DFS space on the machine. However, I'm beginning to realise the
>> disadvantages.
>>
>> Any comments?
>>
>> Thanks
>>
>> Craig
>>
>>
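
For reference, option (a) from the thread, sharing the DFS drives with the task tracker's temporary files, might look roughly like the fragment below. This is only a sketch under assumptions: the mount points /disk1 and /disk2 and both byte values are hypothetical and do not come from the thread. The properties would go in hadoop-site.xml, or in hdfs-site.xml and mapred-site.xml on releases that split the configuration.

```xml
<!-- Sketch only: /disk1 and /disk2 are hypothetical mount points for the two 500GB drives. -->
<property>
  <name>dfs.data.dir</name>
  <!-- DataNode block storage, one directory per drive -->
  <value>/disk1/dfs/data,/disk2/dfs/data</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <!-- Task tracker scratch space on the same drives, so map output can use both disks -->
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Bytes per volume kept free for non-DFS use; 53687091200 (50 GiB) is an assumed value -->
  <value>53687091200</value>
</property>
<property>
  <name>mapred.local.dir.minspacestart</name>
  <!-- Stop requesting new tasks when local free space drops below this; 1 GiB is an assumed value -->
  <value>1073741824</value>
</property>
```

Note that dfs.datanode.du.reserved only keeps DFS from filling a volume; it does not cap what a job writes to mapred.local.dir, so the runaway-job concern raised for option (b) is only partly addressed by settings like these.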
