hadoop-yarn-issues mailing list archives

From "Chris Trezzo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5767) Fix the order that resources are cleaned up from the local Public/Private caches
Date Fri, 21 Oct 2016 22:50:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596593#comment-15596593 ]

Chris Trezzo commented on YARN-5767:
------------------------------------

I see two initial high-level approaches to this problem:
# Add all resources from the public and private caches to a single retention set at the same time. Resources across all caches are then removed using an LRU eviction strategy.
# Use separate retention sets for the public and private caches. This way, an admin can set separate retention sizes for the public and private caches. Furthermore, the caches are isolated from each other, so a large file being localized in a user cache will not cause churn in the public cache, or vice versa.
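
To make the difference between the two approaches concrete, here is a minimal sketch using a toy model rather than the NodeManager classes. Everything in it ({{RetentionSketch}}, {{Entry}}, {{evictLru}}, the sizes and timestamps) is hypothetical and chosen only for illustration. It assumes an entry is evictable when its reference count is zero and that LRU order is decided by a last-used timestamp.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of the two approaches; all names are hypothetical, not Hadoop APIs. */
class RetentionSketch {

  /** A cached file: size, last-access time, and number of containers using it. */
  static final class Entry {
    final String name;
    final long size;
    final long lastUsed;
    final int refCount;

    Entry(String name, long size, long lastUsed, int refCount) {
      this.name = name;
      this.size = size;
      this.lastUsed = lastUsed;
      this.refCount = refCount;
    }
  }

  /**
   * Evicts least-recently-used, unreferenced entries until the total size of
   * the given entries is at or below targetSize. Returns the evicted names.
   */
  static List<String> evictLru(List<Entry> entries, long targetSize) {
    long currentSize = 0;
    List<Entry> candidates = new ArrayList<>();
    for (Entry e : entries) {
      currentSize += e.size;
      if (e.refCount == 0) {
        candidates.add(e); // entries in use are never candidates
      }
    }
    candidates.sort(Comparator.comparingLong((Entry e) -> e.lastUsed));
    List<String> evicted = new ArrayList<>();
    for (Entry e : candidates) {
      if (currentSize <= targetSize) {
        break;
      }
      currentSize -= e.size;
      evicted.add(e.name);
    }
    return evicted;
  }

  /** Approach 1: one retention set spanning both caches. */
  static List<String> combined(List<Entry> pub, List<Entry> priv, long target) {
    List<Entry> all = new ArrayList<>(pub);
    all.addAll(priv);
    return evictLru(all, target);
  }

  /** Approach 2: each cache is trimmed against its own target size. */
  static List<String> separate(List<Entry> pub, long pubTarget,
                               List<Entry> priv, long privTarget) {
    List<String> evicted = new ArrayList<>(evictLru(pub, pubTarget));
    evicted.addAll(evictLru(priv, privTarget));
    return evicted;
  }

  public static void main(String[] args) {
    List<Entry> pub = List.of(new Entry("pub-a", 40, 1, 0), new Entry("pub-b", 40, 2, 0));
    List<Entry> priv = List.of(new Entry("priv-big", 100, 3, 0));
    System.out.println("combined: " + combined(pub, priv, 100));
    System.out.println("separate: " + separate(pub, 100, priv, 100));
  }
}
```

In this toy run the combined set evicts both public entries to make room for the large private file, while the separate sets evict nothing because each cache is within its own target. The cost of isolation is that the total on-disk footprint is bounded by the sum of the two targets rather than by a single number.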

Currently I am leaning towards #2. Please let me know your thoughts! Thanks.

> Fix the order that resources are cleaned up from the local Public/Private caches
> --------------------------------------------------------------------------------
>
>                 Key: YARN-5767
>                 URL: https://issues.apache.org/jira/browse/YARN-5767
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>
> If you look at {{ResourceLocalizationService#handleCacheCleanup}}, you can see that public resources are added to the {{ResourceRetentionSet}} first, followed by private resources:
> {code:java}
> private void handleCacheCleanup(LocalizationEvent event) {
>   ResourceRetentionSet retain =
>     new ResourceRetentionSet(delService, cacheTargetSize);
>   retain.addResources(publicRsrc);
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Resource cleanup (public) " + retain);
>   }
>   for (LocalResourcesTracker t : privateRsrc.values()) {
>     retain.addResources(t);
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Resource cleanup " + t.getUser() + ":" + retain);
>     }
>   }
>   //TODO Check if appRsrcs should also be added to the retention set.
> }
> {code}
> Unfortunately, if we look at {{ResourceRetentionSet#addResources}}, we see that this means public resources are deleted first until the target cache size is met:
> {code:java}
> public void addResources(LocalResourcesTracker newTracker) {
>   for (LocalizedResource resource : newTracker) {
>     currentSize += resource.getSize();
>     if (resource.getRefCount() > 0) {
>       // always retain resources in use
>       continue;
>     }
>     retain.put(resource, newTracker);
>   }
>   for (Iterator<Map.Entry<LocalizedResource,LocalResourcesTracker>> i =
>          retain.entrySet().iterator();
>        currentSize - delSize > targetSize && i.hasNext();) {
>     Map.Entry<LocalizedResource,LocalResourcesTracker> rsrc = i.next();
>     LocalizedResource resource = rsrc.getKey();
>     LocalResourcesTracker tracker = rsrc.getValue();
>     if (tracker.remove(resource, delService)) {
>       delSize += resource.getSize();
>       i.remove();
>     }
>   }
> }
> {code}
> The result is that resources in the private cache are deleted only when the cache size is larger than the target cache size and either the public cache is empty or everything in the public cache is being used by a running container. For clusters that primarily use the public cache (i.e. make use of the shared cache), this means that the most commonly used resources can be deleted before old resources in the private cache. Furthermore, the private cache can continue to grow over time, causing more and more churn in the public cache.
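
The effect described above can be reproduced with a small stand-alone toy. This is a sketch, not the Hadoop code: {{PublicFirstEvictionDemo}} and the file names and sizes are hypothetical, reference counts are ignored, and it assumes deletion candidates are visited in the order they were added (public first), which is the behavior the description attributes to the retention set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for the retention logic; hypothetical names, not the Hadoop classes. */
class PublicFirstEvictionDemo {
  // name -> size; LinkedHashMap keeps insertion order (an assumption of this toy)
  private final Map<String, Long> retain = new LinkedHashMap<>();
  private final List<String> deleted = new ArrayList<>();
  private final long targetSize;
  private long currentSize;
  private long delSize;

  PublicFirstEvictionDemo(long targetSize) {
    this.targetSize = targetSize;
  }

  /** Same shape as the quoted addResources: add one cache, then trim to the target. */
  void addResources(Map<String, Long> cache) {
    for (Map.Entry<String, Long> e : cache.entrySet()) {
      currentSize += e.getValue();
      retain.put(e.getKey(), e.getValue());
    }
    Iterator<Map.Entry<String, Long>> i = retain.entrySet().iterator();
    while (currentSize - delSize > targetSize && i.hasNext()) {
      Map.Entry<String, Long> e = i.next();
      delSize += e.getValue();
      deleted.add(e.getKey());
      i.remove();
    }
  }

  List<String> deleted() {
    return deleted;
  }

  /** Adds a public cache, then a private cache, and reports what was deleted. */
  static List<String> run() {
    Map<String, Long> publicCache = new LinkedHashMap<>();
    publicCache.put("public/hot-lib.jar", 60L);
    publicCache.put("public/shared-conf.xml", 30L);
    Map<String, Long> privateCache = new LinkedHashMap<>();
    privateCache.put("alice/old-job.jar", 80L);

    PublicFirstEvictionDemo set = new PublicFirstEvictionDemo(100L);
    set.addResources(publicCache);  // total 90 <= 100: nothing deleted yet
    set.addResources(privateCache); // total 170 > 100: public entries visited first
    return set.deleted();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

With a target size of 100, both public entries are deleted and the 80-unit private file survives, even though the private file is what pushed the total over the target.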



