hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
Date Thu, 16 Apr 2015 08:37:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497775#comment-14497775 ]

Sunil G commented on YARN-3136:

Hi [~jianhe],
Specific tests are not needed for this change.
However, we need to suppress a few FindBugs warnings for it. Kindly share your opinion.
1. Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.applications; locked 90% of time
Synchronized 90% of the time
Unsynchronized access at AbstractYarnScheduler.java:[line 138]
Unsynchronized access at AbstractYarnScheduler.java:[line 162]
Unsynchronized access at AbstractYarnScheduler.java:[line 230]
2. Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 95% of time
Synchronized 95% of the time
Unsynchronized access at AbstractYarnScheduler.java:[line 140]
Unsynchronized access at AbstractYarnScheduler.java:[line 149]
Synchronized access at AbstractYarnScheduler.java:[line 314]
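
For readers unfamiliar with this warning, the sketch below shows the general shape of access it describes: a field that is touched under the object's lock in most methods and without it in a few. The class and method names (SketchScheduler, addApplication, removeApplication, getUser) are hypothetical and are not taken from AbstractYarnScheduler or from the patch. The sketch assumes the field holds a thread-safe map, which is one reason an unlocked read can be intentional. It makes no claim about whether a particular FindBugs version would report this exact code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a scheduler; NOT the real AbstractYarnScheduler.
class SketchScheduler {
    // Assumption: the map implementation is itself thread-safe.
    private Map<String, String> applications = new ConcurrentHashMap<>();

    // Most accesses hold the scheduler lock.
    synchronized void addApplication(String appId, String user) {
        applications.put(appId, user);
    }

    synchronized void removeApplication(String appId) {
        applications.remove(appId);
    }

    // Deliberately unlocked read: mixing this with the synchronized
    // methods above is the "inconsistent synchronization" shape.
    String getUser(String appId) {
        return applications.get(appId);
    }
}
```

When such unlocked reads are deliberate, the usual options are to make the access pattern consistent or to suppress the warning for the specific field, which is the question raised above.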

> getTransferredContainers can be a bottleneck during AM registration
> -------------------------------------------------------------------
>                 Key: YARN-3136
>                 URL: https://issues.apache.org/jira/browse/YARN-3136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, 0002-YARN-3136.patch,
>                      0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch,
>                      0006-YARN-3136.patch, 0007-YARN-3136.patch, 0008-YARN-3136.patch,
>                      0009-YARN-3136.patch
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs stuck waiting
> for the scheduler lock trying to call getTransferredContainers. The scheduler lock is highly
> contended, especially on a large cluster with many nodes heartbeating, and it would be nice
> if we could find a way to eliminate the need to grab this lock during this call. We've already
> done similar work during AM allocate calls to make sure they don't needlessly grab the scheduler
> lock, and it would be good to do so here as well, if possible.
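
The idea in the description, serving a read without taking the scheduler lock, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class LockFreeLookupScheduler, the recordTransferred method, and the use of plain strings for application and container IDs are hypothetical and do not reflect the attached patches or the real YARN data model. The sketch keeps writers on the lock but lets the lookup go through a concurrent map of immutable snapshots.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified scheduler; names do not mirror the real YARN classes.
class LockFreeLookupScheduler {
    // appId -> container ids carried over from a previous attempt (illustrative).
    private final ConcurrentMap<String, List<String>> transferred =
        new ConcurrentHashMap<>();

    // Writers still serialize on the scheduler lock.
    synchronized void recordTransferred(String appId, List<String> containerIds) {
        transferred.put(appId,
            Collections.unmodifiableList(new ArrayList<>(containerIds)));
    }

    // Reader takes no scheduler lock: it relies on the concurrent map and on
    // the stored lists being immutable snapshots.
    List<String> getTransferredContainers(String appId) {
        return transferred.getOrDefault(appId, Collections.emptyList());
    }
}

class LockFreeLookupDemo {
    // Returns true if a reader thread finished while this thread held the
    // scheduler's monitor, i.e. the read did not need the lock.
    static boolean readerCompletesWhileLocked() {
        LockFreeLookupScheduler s = new LockFreeLookupScheduler();
        s.recordTransferred("app_1", Arrays.asList("c1", "c2"));
        AtomicReference<List<String>> seen = new AtomicReference<>();
        Thread reader =
            new Thread(() -> seen.set(s.getTransferredContainers("app_1")));
        try {
            synchronized (s) {       // simulate a busy scheduler holding its lock
                reader.start();
                reader.join(5000);   // would time out if the read needed the lock
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        List<String> result = seen.get();
        return result != null && result.size() == 2;
    }
}
```

The trade-off in a design like this is that a reader may see a slightly stale snapshot, which is acceptable only if the caller does not need a view that is consistent with other scheduler state.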
