hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
Date Tue, 02 Jan 2018 18:10:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308446#comment-16308446 ]

Jason Lowe commented on YARN-3136:
----------------------------------

bq. Could you please tell me which JIRA is meant by "We've already done similar work during
AM allocate calls to make sure they don't needlessly grab the scheduler lock"?

I was not referring to a specific JIRA, but rather to the existing structure of the code, where
the scheduler drops off allocated containers for the AM to pick up without needing to grab
the scheduler lock. If you're seeing a lot of blocked IPC threads for AM allocate calls, then
I think you should file a new JIRA with the common stack trace(s) showing how they are blocked.
We can then move the discussion there.
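
For readers unfamiliar with that structure, here is a minimal sketch of the drop-off pattern
described above. It is an illustration only, not the actual YARN code: the class and method
names (HandOffSketch, AppMailbox, dropOff, pickUp, allocate, amAllocateCall) are hypothetical,
and a plain Object monitor stands in for the scheduler lock. The scheduler side adds containers
to a concurrent queue while it holds its lock, and the AM-facing side drains that queue without
ever taking the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical illustration of a lock-free hand-off; not the real YARN classes. */
public class HandOffSketch {

    /** Per-application mailbox of newly allocated container IDs (assumed to be strings). */
    public static final class AppMailbox {
        private final ConcurrentLinkedQueue<String> newlyAllocated =
                new ConcurrentLinkedQueue<>();

        /** Called by the scheduler, typically while it already holds its own lock. */
        public void dropOff(String containerId) {
            newlyAllocated.add(containerId);
        }

        /** Called by the AM-facing IPC thread; it never touches the scheduler lock. */
        public List<String> pickUp() {
            List<String> picked = new ArrayList<>();
            String id;
            while ((id = newlyAllocated.poll()) != null) {
                picked.add(id);
            }
            return picked;
        }
    }

    // Stand-in for the heavily contended scheduler lock.
    private final Object schedulerLock = new Object();
    private final AppMailbox mailbox = new AppMailbox();

    /** Scheduler path: the allocation decision is made under the lock, then dropped off. */
    public void allocate(String containerId) {
        synchronized (schedulerLock) {
            mailbox.dropOff(containerId);
        }
    }

    /** AM allocate path: drains the mailbox without acquiring schedulerLock. */
    public List<String> amAllocateCall() {
        return mailbox.pickUp();
    }

    public static void main(String[] args) {
        HandOffSketch sketch = new HandOffSketch();
        sketch.allocate("container_1");
        sketch.allocate("container_2");
        System.out.println(sketch.amAllocateCall());
    }
}
```

In this sketch an IPC thread serving an AM can only wait on the queue's own internal
synchronization, never behind node heartbeats that hold the scheduler lock.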

> getTransferredContainers can be a bottleneck during AM registration
> -------------------------------------------------------------------
>
>                 Key: YARN-3136
>                 URL: https://issues.apache.org/jira/browse/YARN-3136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Sunil G
>              Labels: 2.7.2-candidate
>             Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
>         Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, 00011-YARN-3136.patch,
> 00012-YARN-3136.patch, 00013-YARN-3136.patch, 0002-YARN-3136.patch, 0003-YARN-3136.patch,
> 0004-YARN-3136.patch, 0005-YARN-3136.patch, 0006-YARN-3136.patch, 0007-YARN-3136.patch,
> 0008-YARN-3136.patch, 0009-YARN-3136.patch, YARN-3136.branch-2.7.patch
>
>
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs stuck waiting
> for the scheduler lock trying to call getTransferredContainers.  The scheduler lock is highly
> contended, especially on a large cluster with many nodes heartbeating, and it would be nice
> if we could find a way to eliminate the need to grab this lock during this call.  We've already
> done similar work during AM allocate calls to make sure they don't needlessly grab the scheduler
> lock, and it would be good to do so here as well, if possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
