hadoop-yarn-issues mailing list archives

From "Min Shen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
Date Fri, 19 Aug 2016 23:03:20 GMT
Min Shen created YARN-5543:
------------------------------

             Summary: ResourceManager SchedulingMonitor could potentially terminate the preemption
checker thread
                 Key: YARN-5543
                 URL: https://issues.apache.org/jira/browse/YARN-5543
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler, resourcemanager
    Affects Versions: 2.6.1, 2.7.0
            Reporter: Min Shen


In SchedulingMonitor.java, when the service starts, it starts a checker thread to perform
Capacity Scheduler's preemption. However, the implementation of this checker thread has the
following issue:
{code}
while (!stopped && !Thread.currentThread().isInterrupted()) {
    ....
    try {
      Thread.sleep(monitorInterval);
    } catch (InterruptedException e) {
      ....
      break;  // any interrupt exits the loop, which ends the checker thread
    }
}
{code}
The above code snippet terminates the checker thread whenever it is interrupted.
We noticed in our cluster that this can cause CapacityScheduler's preemption to be
disabled unexpectedly, because the checker thread has terminated.

We propose to use ScheduledExecutorService to improve the robustness of this part of the code
to ensure the liveness of CapacityScheduler's preemption functionality.
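
A minimal sketch of what a ScheduledExecutorService-based checker might look like, not
the actual patch. The class name PreemptionCheckerSketch, the policy Runnable, and the two
counters are hypothetical names used only for illustration. It relies on the documented
behaviour that scheduleWithFixedDelay suppresses all subsequent executions once a run
throws, so each run catches Throwable itself and the schedule keeps going.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the checker in SchedulingMonitor; not the real class.
class PreemptionCheckerSketch {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    private final Runnable policy;       // assumed: one pass of the preemption policy
    private final long intervalMs;       // plays the role of monitorInterval
    private final AtomicInteger runs = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();

    PreemptionCheckerSketch(Runnable policy, long intervalMs) {
        this.policy = policy;
        this.intervalMs = intervalMs;
    }

    void start() {
        // Each run is scheduled intervalMs after the previous run finishes.
        executor.scheduleWithFixedDelay(
            this::runOnce, 0, intervalMs, TimeUnit.MILLISECONDS);
    }

    private void runOnce() {
        runs.incrementAndGet();
        try {
            policy.run();
        } catch (Throwable t) {
            // Swallow (and, in real code, log) the failure. Letting it escape
            // would make the executor cancel every later execution.
            failures.incrementAndGet();
        }
    }

    void stop() {
        // Explicit shutdown is the only way this sketch stops checking.
        executor.shutdownNow();
    }

    int runCount() { return runs.get(); }

    int failureCount() { return failures.get(); }
}
```

With this shape, a failed pass is counted and the next pass still runs after the
configured delay; the checks stop only when stop() is called. Whether swallowing every
Throwable is appropriate for the real monitor is a design decision left open here.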



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

