hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-6251) Fix Scheduler locking issue introduced by YARN-6216
Date Tue, 28 Feb 2017 19:42:45 GMT

     [ https://issues.apache.org/jira/browse/YARN-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated YARN-6251:
------------------------------
    Attachment: YARN-6251.001.patch

Uploading fix.

The deadlock occurs because the {{completeContainer()}} method (used to flush the resources
of temporary containers created during the update) is called in the AM's allocate thread,
which tries to grab the locks on the queue and the app. The Scheduler thread can contend for
the same locks in the reverse order while handling a NODE_UPDATE at the same time.

The proposed solution: instead of calling {{completeContainer()}} directly, we send it
as an event for the Scheduler to handle. This ensures that the Scheduler is the only entity
that takes the lock.
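
To illustrate the idea (this is not the code in the attached patch), here is a minimal, self-contained
sketch of handing the completion to a single event-handling thread instead of taking the locks on the
caller's thread. Every name in it ({{SchedulerEventSketch}}, {{requestContainerCompletion()}},
{{awaitDrained()}}, the two lock objects) is hypothetical and only stands in for the real Scheduler,
queue, and app.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these names come from the YARN code base. */
class SchedulerEventSketch {
    // Stand-ins for the queue lock and the app lock.
    private final Object queueLock = new Object();
    private final Object appLock = new Object();

    // Events are handled one at a time by a single "scheduler" thread.
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    // Atomic only so the demo can read the count from another thread.
    private final AtomicInteger releasedContainers = new AtomicInteger();

    SchedulerEventSketch() {
        Thread schedulerThread = new Thread(() -> {
            try {
                while (true) {
                    events.take().run(); // blocks until an event arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "scheduler-event-loop");
        schedulerThread.setDaemon(true);
        schedulerThread.start();
    }

    // Runs only on the scheduler thread, so the two locks are always
    // acquired by one thread in one fixed order.
    private void completeContainer() {
        synchronized (queueLock) {
            synchronized (appLock) {
                releasedContainers.incrementAndGet();
            }
        }
    }

    /** Called from the allocate thread: takes no lock, only enqueues an event. */
    void requestContainerCompletion() {
        events.add(this::completeContainer);
    }

    /** Waits until every event enqueued before this call has been handled. */
    boolean awaitDrained(long timeoutMs) {
        CountDownLatch marker = new CountDownLatch(1);
        events.add(marker::countDown);
        try {
            return marker.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int getReleasedContainers() {
        return releasedContainers.get();
    }
}
```

In this sketch the caller never holds either lock, so it cannot take part in a lock-order cycle with
the event-handling thread. The trade-off is that the completion becomes asynchronous: the caller only
knows it has been applied after the event has been handled.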

> Fix Scheduler locking issue introduced by YARN-6216
> ---------------------------------------------------
>
>                 Key: YARN-6251
>                 URL: https://issues.apache.org/jira/browse/YARN-6251
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>             Fix For: 3.0.0-alpha3
>
>         Attachments: YARN-6251.001.patch
>
>
> Opening to track a locking issue that was uncovered when running a custom SLS AMSimulator.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


