hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1692) ConcurrentModificationException in fair scheduler AppSchedulable
Date Sat, 08 Feb 2014 00:16:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895294#comment-13895294 ]

Sangjin Lee commented on YARN-1692:
-----------------------------------

I did an escape analysis on the value maps that are stored in AppSchedulingInfo.requests.
The synchronization policy seems a little inconsistent: for the most part the maps are
protected by the FSSchedulerApp and FiCaSchedulerApp instances, but most access is also
guarded by the AppSchedulingInfo instance itself.

In any case, the intention of the existing code seems to be guarding these maps with the FSSchedulerApp/FiCaSchedulerApp
instances. Currently there are three access points that are not guarded by the app instances:
- AppSchedulable.updateDemand() (this one)
- FSSchedulerApp/FiCaSchedulerApp.getResource(Priority)
- FSSchedulerApp/FiCaSchedulerApp.getResourceRequest(Priority,String)

I'll create a patch that synchronizes on the app instance at these access points.
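
To illustrate the kind of change meant here, below is a minimal sketch in Java. It does not use the real YARN classes: SketchApp and SketchSchedulable are hypothetical stand-ins for FSSchedulerApp and AppSchedulable, and the map layout (priority to resource name to container count) is a simplifying assumption. The point is only that the reader iterates the live map while holding the same app monitor that writers hold.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FSSchedulerApp; not the real YARN class.
class SketchApp {
    // priority -> (resource name -> requested containers); guarded by "this".
    private final Map<Integer, Map<String, Integer>> requests = new HashMap<>();

    // Writers hold the app monitor while mutating the maps.
    synchronized void updateRequest(int priority, String resourceName, int containers) {
        requests.computeIfAbsent(priority, p -> new HashMap<>())
                .put(resourceName, containers);
    }

    // Returns the live inner map (it escapes the app), so callers must
    // hold the app monitor while they read or iterate it.
    synchronized Map<String, Integer> getResourceRequests(int priority) {
        return requests.get(priority);
    }
}

// Hypothetical stand-in for AppSchedulable.
class SketchSchedulable {
    // Sums the requested containers for one priority. The iteration runs
    // inside synchronized (app), so it cannot interleave with updateRequest().
    static int updateDemand(SketchApp app, int priority) {
        int demand = 0;
        synchronized (app) {
            Map<String, Integer> byResource = app.getResourceRequests(priority);
            if (byResource != null) {
                for (int containers : byResource.values()) {
                    demand += containers;
                }
            }
        }
        return demand;
    }
}
```

Synchronizing only inside getResourceRequests() would not be enough in this sketch, because the returned map is iterated after the getter returns; the lock has to cover the whole iteration.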

> ConcurrentModificationException in fair scheduler AppSchedulable
> ----------------------------------------------------------------
>
>                 Key: YARN-1692
>                 URL: https://issues.apache.org/jira/browse/YARN-1692
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.0.5-alpha
>            Reporter: Sangjin Lee
>
> We saw a ConcurrentModificationException thrown in the fair scheduler:
> {noformat}
> 2014-02-07 01:40:01,978 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Exception in fair scheduler UpdateThread
> java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
>         at java.util.HashMap$ValueIterator.next(HashMap.java:954)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.updateDemand(AppSchedulable.java:85)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.updateDemand(FSLeafQueue.java:125)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.updateDemand(FSParentQueue.java:82)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.update(FairScheduler.java:217)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$UpdateThread.run(FairScheduler.java:195)
>         at java.lang.Thread.run(Thread.java:724)
> {noformat}
> The map returned by FSSchedulerApp.getResourceRequests() is iterated over without proper
> synchronization.
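
For reference, the fail-fast behavior behind this stack trace can be reproduced deterministically in a single thread. The following is a minimal sketch, not the scheduler code: FailFastDemo and the map keys are hypothetical. In the scheduler the modification presumably comes from another thread, which makes the failure intermittent, but the iterator check that fires is the same one.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; it only illustrates HashMap's fail-fast iterator.
class FailFastDemo {
    // Returns true if advancing the iterator after a structural
    // modification throws ConcurrentModificationException.
    static boolean modifiedDuringIteration() {
        Map<String, Integer> requests = new HashMap<>();
        requests.put("*", 1);
        requests.put("rack1", 1);

        Iterator<Integer> it = requests.values().iterator();
        it.next();                 // iteration has started
        requests.put("host1", 1);  // structural modification behind the iterator's back

        try {
            it.next();             // the iterator notices the changed modification count
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```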



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
