hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3508) Preemption processing occuring on the main RM dispatcher
Date Mon, 29 Jun 2015 22:42:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606576#comment-14606576 ]

Wangda Tan commented on YARN-3508:
----------------------------------

[~varun_saxena],
It seems to me this patch doesn't work the way I expected.

I think the PreemptionPolicy should send events directly to the scheduler event dispatcher, so
that they never go through rmDispatcher. What [~jlowe]/[~jianhe] suggested should be the same.

My understanding of this patch is that preemption events still go to rmDispatcher, and rmDispatcher
then forwards them to schedulerDispatcher:

{code}
  @SuppressWarnings("unchecked")
  public void serviceInit(Configuration conf) throws Exception {
    scheduleEditPolicy.init(conf, rmContext.getDispatcher().getEventHandler(),
        (PreemptableResourceScheduler) rmContext.getScheduler());
    this.monitorInterval = scheduleEditPolicy.getMonitoringInterval();
    super.serviceInit(conf);
  }
{code}

I think one solution is to expose the scheduler event dispatcher through the RMContext interface, so that
scheduleEditPolicy can be initialized with the schedulerDispatcher instead of rmDispatcher.
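
A minimal sketch of that idea, using simplified stand-in classes rather than the real YARN types. Everything here is an assumption made for illustration: `Handler`, `RecordingDispatcher`, `Context`, `Policy`, and in particular the `getSchedulerDispatcher()` accessor are hypothetical names, not existing RMContext API. It only shows that a policy initialized with the scheduler dispatcher's handler never puts events on the main dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatcherRoutingSketch {

  /** Simplified stand-in for an event handler (not the real YARN EventHandler). */
  interface Handler {
    void handle(String event);
  }

  /** Stand-in dispatcher that only records the events handed to it. */
  static final class RecordingDispatcher {
    final List<String> received = new ArrayList<>();

    Handler getEventHandler() {
      return received::add;
    }
  }

  /** Hypothetical context exposing the scheduler dispatcher next to the main one. */
  static final class Context {
    private final RecordingDispatcher rmDispatcher = new RecordingDispatcher();
    private final RecordingDispatcher schedulerDispatcher = new RecordingDispatcher();

    RecordingDispatcher getDispatcher() {
      return rmDispatcher;
    }

    // Assumed accessor; the name is illustrative, not an existing RMContext method.
    RecordingDispatcher getSchedulerDispatcher() {
      return schedulerDispatcher;
    }
  }

  /** Stand-in for the edit policy: emits events through the handler it was given. */
  static final class Policy {
    private Handler handler;

    void init(Handler handler) {
      this.handler = handler;
    }

    void editSchedule() {
      handler.handle("PREEMPT_CONTAINER");
    }
  }

  /** Returns {events seen by rmDispatcher, events seen by schedulerDispatcher}. */
  static int[] demo() {
    Context ctx = new Context();
    Policy policy = new Policy();
    // Initialize with the scheduler dispatcher's handler instead of
    // ctx.getDispatcher().getEventHandler().
    policy.init(ctx.getSchedulerDispatcher().getEventHandler());
    policy.editSchedule();
    return new int[] {
      ctx.getDispatcher().received.size(),
      ctx.getSchedulerDispatcher().received.size()
    };
  }

  public static void main(String[] args) {
    int[] counts = demo();
    System.out.println("rmDispatcher events: " + counts[0]);
    System.out.println("schedulerDispatcher events: " + counts[1]);
  }
}
```

In this sketch the only change on the monitor side is which handler is passed to `init`; the policy itself does not need to know which dispatcher it is talking to.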

Also, could you take a look at the failed tests? They look related to the patch.

> Preemption processing occuring on the main RM dispatcher
> --------------------------------------------------------
>
>                 Key: YARN-3508
>                 URL: https://issues.apache.org/jira/browse/YARN-3508
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-3508.002.patch, YARN-3508.01.patch, YARN-3508.03.patch, YARN-3508.04.patch
>
>
> We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event
> queue. The AsyncDispatcher thread was consistently blocked on the highly contended CapacityScheduler
> lock while trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher.
> Preemption processing should occur on the scheduler event dispatcher thread or a separate
> thread to avoid delaying the processing of other events in the primary dispatcher queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
