hadoop-mapreduce-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (MAPREDUCE-314) Avoid priority inversion that could result due to scheduling running jobs in an order sorted by priority
Date Mon, 21 Jul 2014 16:46:39 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved MAPREDUCE-314.
----------------------------------------

    Resolution: Fixed

I'm going to close this as fixed:

# scheduling has been rewritten a few times now
# user limit helps here tremendously

> Avoid priority inversion that could result due to scheduling running jobs in an order sorted by priority
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-314
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-314
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Hemanth Yamijala
>
> - Consider a job, J1, with priority NORMAL that is running reduce tasks occupying all reduce slots and has running and pending map tasks.
> - At this point, suppose a job, J2, is submitted with priority HIGH, or its priority is changed from NORMAL to HIGH.
> - Schedulers will typically start scheduling tasks from job J2 as J1's running maps complete. The default scheduler in Hadoop does this, and with HADOOP-4471, so will the capacity scheduler.
> - However, as there are still pending maps in J1, the reduce tasks of J1 are all stuck, and no reduce tasks of J2 can run.
> - So, all map tasks of J2 will complete, followed by completion of all map tasks of J1, and only then will J1's reduce tasks finish and free their slots for J2 to complete.
> This could result in jobs completing slowly. Also, if there are enough jobs of higher priority, low-priority jobs could be starved. At the same time, more and more resources (such as intermediate disk space) will get consumed without jobs completing.
> This jira is to discuss and implement a solution for the above problem.
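
The scenario in the description can be reproduced with a toy slot simulation. This is only a sketch under stated assumptions, not Hadoop's scheduler code: the Job class, the simulate function, the one-tick task durations, and the slot counts are all hypothetical and chosen for illustration. It assumes every map takes one tick, a reduce holds its slot until all maps of its job are done and then needs one more tick, and free slots always go to the highest-priority job first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical toy model of a job; not a Hadoop class.
    name: str
    priority: int                 # larger value = scheduled first
    pending_maps: int
    pending_reduces: int
    running_reduces: int = 0      # reduce slots already held at tick 0
    maps_done_at: Optional[int] = None
    finished_at: Optional[int] = None


def simulate(jobs, map_slots, reduce_slots, max_ticks=1000):
    """Return {job name: tick at which the job finished} for the toy model."""
    for tick in range(1, max_ticks + 1):
        # Stable sort: equal priorities keep submission order.
        by_priority = sorted(jobs, key=lambda j: -j.priority)

        # 1. Hand out any free reduce slots in priority order.
        free_reduce = reduce_slots - sum(j.running_reduces for j in jobs)
        for job in by_priority:
            grant = min(free_reduce, job.pending_reduces)
            job.pending_reduces -= grant
            job.running_reduces += grant
            free_reduce -= grant

        # 2. Hand out map slots in priority order; maps finish within the tick.
        free_map = map_slots
        for job in by_priority:
            grant = min(free_map, job.pending_maps)
            job.pending_maps -= grant
            free_map -= grant
            if job.pending_maps == 0 and job.maps_done_at is None:
                job.maps_done_at = tick

        # 3. Reduces of a job whose maps finished in an earlier tick complete
        #    now and release their slots.
        for job in jobs:
            maps_done_earlier = (
                job.maps_done_at is not None and job.maps_done_at < tick
            )
            if maps_done_earlier and job.finished_at is None:
                job.running_reduces = 0
                if job.pending_reduces == 0:
                    job.finished_at = tick

        if all(j.finished_at is not None for j in jobs):
            break
    return {j.name: j.finished_at for j in jobs}


def scenario(j2_priority):
    # J1 (NORMAL = 1) already holds both reduce slots and has 4 maps left.
    j1 = Job("J1", priority=1, pending_maps=4, pending_reduces=0,
             running_reduces=2)
    j2 = Job("J2", priority=j2_priority, pending_maps=4, pending_reduces=2)
    return simulate([j1, j2], map_slots=2, reduce_slots=2)


if __name__ == "__main__":
    print("J2 at HIGH:  ", scenario(j2_priority=2))
    print("J2 at NORMAL:", scenario(j2_priority=1))
```

With these toy parameters, raising J2 to HIGH delays both jobs: J2's maps jump ahead of J1's, but J2's reduces still cannot start until J1's maps finish and J1's reduces release their slots, so J2 ends up finishing later than it would have at NORMAL priority.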



--
This message was sent by Atlassian JIRA
(v6.2#6252)
