hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5767) Fix the order that resources are cleaned up from the local Public/Private caches
Date Mon, 24 Oct 2016 22:06:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15603367#comment-15603367
] 

Hadoop QA commented on YARN-5767:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color}
| {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 54s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color}
| {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} |
{color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color}
| {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
The patch generated 0 new + 160 unchanged - 27 fixed = 160 total (was 187) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color}
| {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red}
The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 53s {color} | {color:green}
hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 42s {color} | {color:black}
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835016/YARN-5767-trunk-v2.patch
|
| JIRA Issue | YARN-5767 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs
 checkstyle  |
| uname | Linux 2f2b8c5e8486 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a1a0281 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/13492/artifact/patchprocess/whitespace-eol.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13492/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13492/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Fix the order that resources are cleaned up from the local Public/Private caches
> --------------------------------------------------------------------------------
>
>                 Key: YARN-5767
>                 URL: https://issues.apache.org/jira/browse/YARN-5767
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 2.7.0, 3.0.0-alpha1
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>         Attachments: YARN-5767-trunk-v1.patch, YARN-5767-trunk-v2.patch
>
>
> If you look at {{ResourceLocalizationService#handleCacheCleanup}}, you can see that public
resources are added to the {{ResourceRetentionSet}} first followed by private resources:
> {code:java}
> private void handleCacheCleanup(LocalizationEvent event) {
>   ResourceRetentionSet retain =
>     new ResourceRetentionSet(delService, cacheTargetSize);
>   retain.addResources(publicRsrc);
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Resource cleanup (public) " + retain);
>   }
>   for (LocalResourcesTracker t : privateRsrc.values()) {
>     retain.addResources(t);
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Resource cleanup " + t.getUser() + ":" + retain);
>     }
>   }
>   //TODO Check if appRsrcs should also be added to the retention set.
> }
> {code}
> Unfortunately, if we look at {{ResourceRetentionSet#addResources}} we see that this means
public resources are deleted first until the target cache size is met:
> {code:java}
> public void addResources(LocalResourcesTracker newTracker) {
>   for (LocalizedResource resource : newTracker) {
>     currentSize += resource.getSize();
>     if (resource.getRefCount() > 0) {
>       // always retain resources in use
>       continue;
>     }
>     retain.put(resource, newTracker);
>   }
>   for (Iterator<Map.Entry<LocalizedResource,LocalResourcesTracker>> i =
>          retain.entrySet().iterator();
>        currentSize - delSize > targetSize && i.hasNext();) {
>     Map.Entry<LocalizedResource,LocalResourcesTracker> rsrc = i.next();
>     LocalizedResource resource = rsrc.getKey();
>     LocalResourcesTracker tracker = rsrc.getValue();
>     if (tracker.remove(resource, delService)) {
>       delSize += resource.getSize();
>       i.remove();
>     }
>   }
> }
> {code}
> The result of this is that resources in the private cache are only deleted in the cases
where:
> # The cache size is larger than the target cache size and the public cache is empty.
> # The cache size is larger than the target cache size and everything in the public cache
is being used by a running container.
> For clusters that primarily use the public cache (i.e. make use of the shared cache),
this means that the most commonly used resources can be deleted before old resources in the
private cache. Furthermore, the private cache can continue to grow over time causing more
and more churn in the public cache.
> Additionally, the same problem exists within the private cache. Since resources are added
to the retention set on a user by user basis, resources will get cleaned up one user at a
time in the order that privateRsrc.values() returns the LocalResourcesTracker. So if user1
has 10MB in their cache and user2 has 100MB in their cache and the target size of the cache
is 50MB, user1 could potentially have their entire cache removed before anything is deleted
from the user2 cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message