Date: Sat, 8 Feb 2014 00:16:20 +0000 (UTC)
From: "Sangjin Lee (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1692) ConcurrentModificationException in fair scheduler AppSchedulable

    [ https://issues.apache.org/jira/browse/YARN-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895294#comment-13895294 ]

Sangjin Lee commented on YARN-1692:
-----------------------------------

I did an escape analysis on the value maps that are stored in AppSchedulingInfo.requests. The synchronization policy seems a little inconsistent in that for the most part it is really protected by the FSSchedulerApp and FiCaSchedulerApp instances. However, most access is also guarded by the AppSchedulingInfo instance itself.
In any case, the intention of the existing code seems to be to guard these maps with the FSSchedulerApp/FiCaSchedulerApp instances. Currently there are three access points that are not guarded by the app instances:
- AppSchedulable.updateDemand() (this one)
- FSSchedulerApp/FiCaSchedulerApp.getResource(Priority)
- FSSchedulerApp/FiCaSchedulerApp.getResourceRequest(Priority,String)

I'll create a patch that synchronizes on the app instance at these access points.

> ConcurrentModificationException in fair scheduler AppSchedulable
> ----------------------------------------------------------------
>
>                 Key: YARN-1692
>                 URL: https://issues.apache.org/jira/browse/YARN-1692
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.0.5-alpha
>            Reporter: Sangjin Lee
>
> We saw a ConcurrentModificationException thrown in the fair scheduler:
> {noformat}
> 2014-02-07 01:40:01,978 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Exception in fair scheduler UpdateThread
> java.util.ConcurrentModificationException
>   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
>   at java.util.HashMap$ValueIterator.next(HashMap.java:954)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.updateDemand(AppSchedulable.java:85)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.updateDemand(FSLeafQueue.java:125)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.updateDemand(FSParentQueue.java:82)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.update(FairScheduler.java:217)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$UpdateThread.run(FairScheduler.java:195)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> The map that gets returned by FSSchedulerApp.getResourceRequests() is iterated on without proper synchronization.
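
For illustration only, the failure in the trace above and the proposed guard can be sketched in isolation. This is a minimal sketch, not the actual patch: GuardedDemand, updateRequest, removeRequest and computeDemand are hypothetical names standing in for the app instance and its request map, and the real classes differ. It assumes the fix amounts to holding the app instance's monitor both while mutating the map and while iterating its values.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a scheduler app; NOT the real FSSchedulerApp.
class GuardedDemand {
    // Plain HashMap: its iterators are fail-fast and it is not thread-safe.
    private final Map<String, Integer> requests = new HashMap<>();

    // Writer paths: mutate the map while holding this instance's monitor.
    synchronized void updateRequest(String resourceName, int containers) {
        requests.put(resourceName, containers);
    }

    synchronized void removeRequest(String resourceName) {
        requests.remove(resourceName);
    }

    // Reader path: iterate under the same monitor the writers use.
    int computeDemand() {
        int demand = 0;
        synchronized (this) {
            for (int containers : requests.values()) {
                demand += containers;
            }
        }
        return demand;
    }

    // Single-threaded reproduction of the fail-fast check: a structural
    // modification between two next() calls makes the iterator throw.
    static boolean unguardedIterationFails() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (int value : map.values()) {
                map.put("extra-" + value, value); // adds a new key mid-iteration
            }
        } catch (ConcurrentModificationException expected) {
            return true;
        }
        return false;
    }

    // One thread mutates while another iterates, both under the app monitor.
    // Returns the final demand, or -1 if the reader saw a CME.
    static int guardedRun() {
        GuardedDemand app = new GuardedDemand();
        app.updateRequest("*", 4);
        AtomicBoolean failed = new AtomicBoolean(false);
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                String host = "host-" + (i % 16);
                app.updateRequest(host, 1);
                app.removeRequest(host);
            }
        });
        Thread reader = new Thread(() -> {
            try {
                for (int i = 0; i < 10_000; i++) {
                    app.computeDemand();
                }
            } catch (ConcurrentModificationException e) {
                failed.set(true);
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failed.get() ? -1 : app.computeDemand();
    }

    public static void main(String[] args) {
        System.out.println("unguarded iteration fails: " + unguardedIterationFails());
        System.out.println("guarded demand: " + guardedRun());
    }
}
```

The sketch only helps if every access path takes the same lock. That is why the three unguarded access points listed above would all need to synchronize on the app instance rather than on some other object.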