hadoop-mapreduce-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: yarn cache settings
Date Wed, 28 Jan 2015 02:38:30 GMT
If you are running your /var/lib/hadoop/tmp dir in the / file system, you may
want to reconsider that: heavy disk I/O there will cause issues for the OS as
it attempts to use its own file system.
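
One quick way to check (a sketch; the path is the one from this thread, so
adjust it for your cluster):

```shell
# Show which filesystem backs the NodeManager local dir. If the
# "Mounted on" column says "/", the dir shares the root filesystem
# and its I/O competes with the OS. Falls back to / if the dir is
# absent on this machine.
df -P /var/lib/hadoop/tmp 2>/dev/null || df -P /
```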


*“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming “Wow! What a Ride!” - Hunter Thompson*

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Tue, Jan 27, 2015 at 3:46 PM, hitarth trivedi <t.hitarth@gmail.com> wrote:

> Hi,
>
> We have yarn.nodemanager.local-dirs set to
> /var/lib/hadoop/tmp/nm-local-dir. This is the directory where the mapreduce
> jobs store temporary data. On restart of the nodemanager, the contents of
> the directory are deleted. I see the following definitions for
> yarn.nodemanager.localizer.cache.target-size-mb (default 10240 MB) and
> yarn.nodemanager.localizer.cache.cleanup.interval-ms (default 600000 ms,
> i.e. 10 minutes):
> ·  *yarn.nodemanager.localizer.cache.target-size-mb*: This decides the
> maximum disk space to be used for localizing resources. (At present there
> is no individual limit for the PRIVATE / APPLICATION / PUBLIC caches;
> YARN-882 <https://issues.apache.org/jira/browse/YARN-882> tracks this.)
> Once the total disk usage of the cache exceeds this value, the deletion
> service will try to remove files that are not used by any running
> container. The limit applies to the total across all disks, not to each
> disk individually.
> ·  *yarn.nodemanager.localizer.cache.cleanup.interval-ms*: After this
> interval the resource localization service will try to delete unused
> resources if the total cache size exceeds the configured maximum. Unused
> resources are those not referenced by any running container. Every time a
> container requests a resource, the container is added to that resource’s
> reference list, and it stays there until the container finishes, which
> prevents accidental deletion of an in-use resource. As part of container
> cleanup (when the container finishes) the container is removed from each
> resource’s reference list, so a resource whose reference count drops to
> zero becomes a candidate for deletion. Resources are deleted on an LRU
> basis until the current cache size drops below the target size.
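
The reference-counting and LRU eviction described in the two bullets above
can be sketched as follows. This is a toy model, not YARN code (the real
logic is in the NodeManager's Java localization service), and all names here
are hypothetical:

```python
import time

class CacheSketch:
    """Toy model of the localizer cache eviction described above.

    A resource is deletable only when no running container references
    it, and eviction goes least-recently-used first until total usage
    drops below the target size.
    """

    def __init__(self, target_size_mb):
        self.target_size_mb = target_size_mb
        self.resources = {}  # name -> {"size_mb", "refs", "last_used"}

    def request(self, name, size_mb, container):
        r = self.resources.setdefault(
            name, {"size_mb": size_mb, "refs": set(), "last_used": 0.0})
        r["refs"].add(container)        # container joins the reference list
        r["last_used"] = time.monotonic()

    def container_finished(self, container):
        # Container cleanup drops the container from every reference list.
        for r in self.resources.values():
            r["refs"].discard(container)

    def total_size_mb(self):
        return sum(r["size_mb"] for r in self.resources.values())

    def cleanup(self):
        """Evict unreferenced resources, LRU first, until under target."""
        victims = sorted(
            (n for n, r in self.resources.items() if not r["refs"]),
            key=lambda n: self.resources[n]["last_used"])
        for name in victims:
            if self.total_size_mb() <= self.target_size_mb:
                break
            del self.resources[name]

cache = CacheSketch(target_size_mb=1024)
cache.request("job.jar", 600, "container_1")
cache.request("dist.zip", 700, "container_1")
cache.cleanup()                    # nothing evicted: both still referenced
assert cache.total_size_mb() == 1300
cache.container_finished("container_1")
cache.cleanup()                    # now evicts LRU-first until <= 1024 MB
assert cache.total_size_mb() <= 1024
```

Note what the sketch makes explicit: as long as resources are referenced by
running containers, cleanup removes nothing, no matter how far past the
target size the cache grows.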
> My */var/lib/hadoop/tmp/nm-local-dir* has an allocated size of 5 GB. For
> testing I set yarn.nodemanager.localizer.cache.target-size-mb to a lower
> value of 1 GB, expecting the service to delete the contents once the cache
> crosses that limit. But the directory keeps growing beyond the limit with
> every run of the mapreduce jobs, and the cleanup service never kicks in.
> The jobs succeed and complete. Do I need to do something else?
> Thanks,
> Hitarth
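
For reference, the two settings discussed in the quoted message go in
yarn-site.xml. A sketch with the 1 GB test value from this thread (the
NodeManager must be restarted after changing them):

```xml
<!-- yarn-site.xml: localizer cache settings discussed in this thread -->
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>1024</value>  <!-- 1 GB for testing; default is 10240 -->
</property>
<property>
  <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
  <value>600000</value>  <!-- default: 10 minutes -->
</property>
```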
