hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3508) Preemption processing occuring on the main RM dispatcher
Date Tue, 30 Jun 2015 17:58:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608746#comment-14608746 ]

Varun Saxena commented on YARN-3508:

[~leftnoteasy], yes, this is what the patch does, and it addresses the issue at hand. I thought
you wanted me to put the events directly on the scheduler event queue and hence bypass the central RM dispatcher.

Anyway, just to explain, here is what I have done.
I have removed {{ContainerPreemptEventType}} and moved all of its events to {{SchedulerEventType}}.
I have also removed {{RMContainerPreemptEventDispatcher}}, because it processes the events
synchronously on the RM dispatcher event thread.

With the patch, the events from {{ProportionalCapacityPreemptionPolicy}} are sent to the central
RM dispatcher and, because their event type is {{SchedulerEventType}}, they are forwarded
to the scheduler event queue.
The scheduler dispatcher then picks up these events from the scheduler event queue and calls {{CapacityScheduler#handle}},
which in turn calls the relevant method for each kind of preemption event.
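
To make the flow above concrete, here is a minimal Java sketch of the hand-off to a separate scheduler thread. It is not code from the patch: every class, enum constant and thread name in it ({{PreemptionDispatchSketch}}, {{SketchEventType}}, {{SketchScheduler}}, {{SketchSchedulerDispatcher}}, "scheduler-event-thread") is a hypothetical stand-in, and it uses only the JDK. The point it illustrates is that the caller only enqueues the event, so any wait on the scheduler lock would happen on the scheduler thread and not on the caller's thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-ins only; none of these are the real YARN classes.
class PreemptionDispatchSketch {

    // Illustrative event types; the real SchedulerEventType has other members.
    enum SketchEventType { PREEMPT_CONTAINER, KILL_CONTAINER }

    static final class SketchEvent {
        final SketchEventType type;
        final String containerId;

        SketchEvent(SketchEventType type, String containerId) {
            this.type = type;
            this.containerId = containerId;
        }
    }

    // Stand-in for the scheduler: records which thread handled each event.
    static final class SketchScheduler {
        final List<String> handled = Collections.synchronizedList(new ArrayList<>());

        void handle(SketchEvent event) {
            handled.add(event.type + "@" + Thread.currentThread().getName());
        }
    }

    // Own queue and own thread: work that may block on the scheduler lock
    // runs here, not on the thread that called enqueue().
    static final class SketchSchedulerDispatcher {
        private static final SketchEvent STOP = new SketchEvent(null, null);
        private final BlockingQueue<SketchEvent> queue = new LinkedBlockingQueue<>();
        private final Thread worker;

        SketchSchedulerDispatcher(SketchScheduler scheduler) {
            worker = new Thread(() -> {
                try {
                    while (true) {
                        SketchEvent event = queue.take();
                        if (event == STOP) {
                            return; // sentinel reached: everything before it was handled
                        }
                        scheduler.handle(event);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "scheduler-event-thread");
        }

        void start() {
            worker.start();
        }

        // Called from the central dispatcher thread; returns immediately.
        void enqueue(SketchEvent event) {
            queue.add(event);
        }

        // Lets the worker drain the queue in FIFO order, then waits for it to exit.
        void stopAfterDraining() {
            queue.add(STOP);
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while stopping", e);
            }
        }
    }

    // Plays the role of the central dispatcher thread forwarding two events.
    static List<String> demo() {
        SketchScheduler scheduler = new SketchScheduler();
        SketchSchedulerDispatcher dispatcher = new SketchSchedulerDispatcher(scheduler);
        dispatcher.start();
        dispatcher.enqueue(new SketchEvent(SketchEventType.PREEMPT_CONTAINER, "container_1"));
        dispatcher.enqueue(new SketchEvent(SketchEventType.KILL_CONTAINER, "container_1"));
        dispatcher.stopAfterDraining();
        return new ArrayList<>(scheduler.handled);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch both events are recorded as handled on "scheduler-event-thread", in the order they were enqueued, while the calling thread does nothing beyond adding them to the queue.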

I also added a test case for this behavior in a new test class, {{TestRMDispatcher}}.

> Preemption processing occuring on the main RM dispatcher
> --------------------------------------------------------
>                 Key: YARN-3508
>                 URL: https://issues.apache.org/jira/browse/YARN-3508
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-3508.002.patch, YARN-3508.01.patch, YARN-3508.03.patch, YARN-3508.04.patch
> We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event
> queue.  The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler
> lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher.
> Preemption processing should occur on the scheduler event dispatcher thread or a separate
> thread to avoid delaying the processing of other events in the primary dispatcher queue.

This message was sent by Atlassian JIRA
