hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6251) Fix Scheduler locking issue introduced by YARN-6216
Date Fri, 18 Aug 2017 20:11:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133566#comment-16133566 ]

Wangda Tan commented on YARN-6251:
----------------------------------

Thanks [~asuresh] for the fix.

In general, the fix looks correct and I believe it solves the problem. However, I'm a little
worried about the semantics:

Even though the patch calls the event "ReleaseTempContainer", it doesn't release the container
immediately, so some code paths may fail.

For decreasing/demoting a container, the RM first tells the AM that the container was
decreased/demoted, and then decreases the used resource internally; this is fine.

However, some code paths expect the resource to be released before a new resource is allocated
(like continuous-reservation-looking). With a deferred release on such a path, cluster resources
can temporarily be over-allocated.

To avoid misuse, several naming suggestions:
1) RELEASE_TEMP_CONTAINER (and the same for the class name) -> RELEASE_CONTAINER (since we don't
have special logic to handle "temp" containers).
2) "handleTempContainerRelease" -> "asyncReleaseContainer". Also add Javadoc to the method
noting that if the caller expects containers to be released before the method returns, it should
use completeContainer instead.
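
To make the concern concrete, here is a minimal toy sketch of the difference between a deferred
release and a synchronous one. It is not the real scheduler code: the class ReleaseSketch, its
integer "resource" model, and drainReleaseEvents are hypothetical names used only for
illustration; only the method names asyncReleaseContainer and completeContainer come from the
suggestions above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model only; assumes resources are plain ints and is not the actual YARN scheduler. */
class ReleaseSketch {
    private final int capacity;
    private int used;
    // Releases that were requested but not yet applied (hypothetical event queue).
    private final Queue<Integer> pendingReleases = new ArrayDeque<>();

    ReleaseSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Allocates immediately; deliberately does not check capacity so overflow is observable. */
    synchronized void allocate(int size) {
        used += size;
    }

    /** Synchronous release: the resource is freed before this method returns. */
    synchronized void completeContainer(int size) {
        used -= size;
    }

    /**
     * Queues a release event. The resource is NOT freed when this method returns.
     * If the caller expects the container to be released before the method returns,
     * it should use {@link #completeContainer(int)} instead.
     */
    synchronized void asyncReleaseContainer(int size) {
        pendingReleases.add(size);
    }

    /** Stand-in for the event handler that later applies the queued releases. */
    synchronized void drainReleaseEvents() {
        Integer size;
        while ((size = pendingReleases.poll()) != null) {
            used -= size;
        }
    }

    synchronized int getUsed() {
        return used;
    }

    synchronized boolean isOverCapacity() {
        return used > capacity;
    }
}
```

In this sketch, a caller that fills a cluster of capacity 10, calls asyncReleaseContainer(4), and
then allocates 4 more sees isOverCapacity() return true until drainReleaseEvents() runs. The same
sequence with completeContainer(4) never exceeds capacity.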

> Fix Scheduler locking issue introduced by YARN-6216
> ---------------------------------------------------
>
>                 Key: YARN-6251
>                 URL: https://issues.apache.org/jira/browse/YARN-6251
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-6251.001.patch, YARN-6251.002.patch, YARN-6251.003.patch, YARN-6251.004.patch
>
>
> Opening to track a locking issue that was uncovered when running a custom SLS AMSimulator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

