hadoop-mapreduce-user mailing list archives

From hitarth trivedi <t.hita...@gmail.com>
Subject Re: yarn cache settings
Date Wed, 28 Jan 2015 20:08:16 GMT
Hi,

What can I do instead ? Should I point my local-dir to something else? If
so, what?

Thanks,
Hitarth

On Tue, Jan 27, 2015 at 9:38 PM, daemeon reiydelle <daemeonr@gmail.com>
wrote:

> If your /var/lib/hadoop/tmp dir lives on the / file system, you may
> want to reconsider that. Heavy disk IO there will cause issues for the OS as
> it attempts to use its own file system.
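A quick way to check which mount a directory actually lives on (the path below is the one from this thread; substitute your own):

```shell
# Show the filesystem and mount point backing the NodeManager local dir.
# /var/lib/hadoop/tmp is the path from this thread; substitute yours.
dir=/var/lib/hadoop/tmp
df -P "$dir" 2>/dev/null || df -P /    # fall back to / if the dir is absent here
```

If the "Mounted on" column resolves to /, the local dir shares a disk with the OS; a dedicated data disk would show its own mount point.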
>
>
>
> *.......*
>
> *“Life should not be a journey to the grave with the intention of arriving
> safely in a pretty and well preserved body, but rather to skid in broadside
> in a cloud of smoke, thoroughly used up, totally worn out, and loudly
> proclaiming “Wow! What a Ride!”” - Hunter Thompson*
>
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872
>
> On Tue, Jan 27, 2015 at 3:46 PM, hitarth trivedi <t.hitarth@gmail.com>
> wrote:
>
>> Hi,
>>
>>
>>
>> We have yarn.nodemanager.local-dirs set to
>> /var/lib/hadoop/tmp/nm-local-dir. This is the directory where MapReduce
>> jobs store temporary data. On restart of the NodeManager, the contents of
>> the directory are deleted. I found the following definitions for
>> yarn.nodemanager.localizer.cache.target-size-mb (default 10240 MB) and
>> yarn.nodemanager.localizer.cache.cleanup.interval-ms (default 600000 ms,
>> i.e. 10 minutes):
>>
>>
>>
>> ·  *yarn.nodemanager.localizer.cache.target-size-mb*: This sets the
>> maximum disk space to be used for localized resources. (At present there
>> is no individual limit for the PRIVATE / APPLICATION / PUBLIC caches; see
>> YARN-882 <https://issues.apache.org/jira/browse/YARN-882>.) Once the total
>> size of the cache exceeds this value, the deletion service will try to
>> remove files that are not in use by any running container. The limit
>> applies to all the disks in total, not on a per-disk basis.
>>
>> ·  *yarn.nodemanager.localizer.cache.cleanup.interval-ms*: At this
>> interval the resource localization service will try to delete unused
>> resources if the total cache size exceeds the configured maximum. Unused
>> resources are those not referenced by any running container. Whenever a
>> container requests a resource, the container is added to that resource's
>> reference list, and it stays there until the container finishes, which
>> prevents accidental deletion of the resource. As part of container cleanup
>> (when the container finishes) the container is removed from the resource's
>> reference list, so once a resource's reference count drops to zero it
>> becomes a candidate for deletion. Resources are deleted on an LRU basis
>> until the current cache size drops below the target size.
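For reference, the two settings look like this in yarn-site.xml (1024 MB is just an example value, matching the 1 GB test described below; 600000 ms is the default cleanup interval):

```xml
<!-- yarn-site.xml: example values for the two localizer cache properties.
     1024 MB is an illustrative test limit, not a recommendation. -->
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
  <value>600000</value>
</property>
```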
>>
>>
>>
>> My */var/lib/hadoop/tmp/nm-local-dir* has an allocated size of 5 GB. For
>> testing purposes I set yarn.nodemanager.localizer.cache.target-size-mb to
>> a lower value of 1 GB, expecting the service to delete the cached contents
>> once the directory crosses that limit. But I see the size growing beyond
>> the limit with every run of the MapReduce jobs, and the cleanup service
>> never kicks in to delete the contents. The jobs succeed and complete. Do
>> I need to do something else?
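As I read the definitions above, the cleanup I was expecting amounts to something like this sketch (illustrative Python only; the function and field names are invented for this example, not YARN's actual code): only resources with no running-container references are candidates, and they are removed oldest-access-first until the cache fits the target.

```python
def cleanup_cache(resources, target_size_mb):
    """Delete unused (refcount == 0) cached resources, least recently
    used first, until the total size drops to the target. Illustrative
    sketch of the localizer cleanup described above, not YARN's code."""
    total = sum(r["size_mb"] for r in resources)
    # Only resources not referenced by any running container are candidates.
    candidates = sorted(
        (r for r in resources if r["refcount"] == 0),
        key=lambda r: r["last_access"],  # LRU first
    )
    kept = list(resources)
    for r in candidates:
        if total <= target_size_mb:
            break
        kept.remove(r)
        total -= r["size_mb"]
    return kept

# Example: 3 GB cached, 1 GB target. The referenced jar survives; the
# unreferenced entries are removed in least-recently-used order.
cache = [
    {"name": "job1.jar", "size_mb": 1024, "refcount": 0, "last_access": 100},
    {"name": "job2.jar", "size_mb": 1024, "refcount": 1, "last_access": 200},
    {"name": "dist.tgz", "size_mb": 1024, "refcount": 0, "last_access": 300},
]
print([r["name"] for r in cleanup_cache(cache, 1024)])  # → ['job2.jar']
```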
>>
>>
>>
>> Thanks,
>>
>> Hitarth
>>
>
>
