hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-706) Support for FIFO pools in the fair scheduler
Date Wed, 05 Aug 2009 00:26:15 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739268#action_12739268

Matei Zaharia commented on MAPREDUCE-706:

Thanks for the review, Aaron! Here is a new patch taking into account your comments:

Task task = new ReduceTask("", attemptId, 0, maps.length, 1) <-- shouldn't this be "reduces.length"

This actually needs to be the number of maps according to the ReduceTask API; it lets it create
a data structure for each map.

bq. TestFairScheduler.getLocalityLevel(): These locality level constants are used throughout
the FairScheduler; they should be converted to an Enum. (Magic constants are evil.)

Fixed. I added a LocalityLevel enum with methods for getting the locality level of a given
task and converting a locality level to a cache level cap for obtainNewMapTask.

bq. TestComputeFairShares.testEmptyList() - should this call verifyShares() after computeFairShares()
to assert that the list length is zero?


bq. PoolManager.parseSchedulingMode(): why case sensitive 'fifo' and 'fair' ? maybe use toLower()


bq. PoolSchedulable c'tor: scheduler.getClock().getTime() should be called only once to guarantee
this.lastTimeAtMinShare == this.lastTimeAtHalfFairShare on start?


bq. assignTask(): Is SchedulingMode guaranteed to never be extended by another internal algorithm?
If not, turn "else" into "else if" and have an "else throw InvalidArgumentException" at the
end of the case.

Fixed. Throws RuntimeException if there's another mode, which should cause the JobTracker
to exit and make it obvious that there's something deeply wrong.

bq. JobSchedulable.updateDemand(): why does this use System.currentTimeMillis() instead of
getting the time from a Clock object?


bq. Schedulable's class javadoc: typo "algoirthms"


bq. SchuldingAlgorithms.LOG: rather than use a string, use SchedulingAlgorithms.class.getName()

Fixed. Also fixed this in PoolSchedulable.

bq. FairScheduler.UpdateThread.run(): why is preemptTasksIfNecessary() commented out? Needs
a comment for rationale.

Oops! Fixed. I had commented that out for debugging.

bq. FairScheduler.assignTasks() - Should convert System.out.println to log msg.

Removed the println, it was also for debugging.

bq. This method is also getting pretty long. Consider refactoring the inner loop into shorter
methods if you need to add anything else to it in the future.

I've refactored it to take out the delay scheduling code and the cap calculation code. It's
now closer to fitting on one screenful.

You have the comment: // Job not in infos (shouldn't happen)- 
... So throw an exception if it does, or at least log this e

Fixed. I log an error so the scheduler doesn't crash if for some reason a bug does cause a
job to go through there when it has no info set.

In addition to these fixes, I've added a bit more documentation on the delay scheduling methods
in this version of the patch, and I've changed Schedulable.assignTasks to take the current
time as a parameter so that we don't have multiple calls to System.currentTimeMillis on each
heartbeat (in case that hurts performance).

> Support for FIFO pools in the fair scheduler
> --------------------------------------------
>                 Key: MAPREDUCE-706
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-706
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>         Attachments: fsdesigndoc.pdf, fsdesigndoc.tex, mapreduce-706.patch, mapreduce-706.v1.patch,
mapreduce-706.v2.patch, mapreduce-706.v3.patch
> The fair scheduler should support making the internal scheduling algorithm for some pools
be FIFO instead of fair sharing in order to work better for batch workloads. FIFO pools will
behave exactly like the current default scheduler, sorting jobs by priority and then submission
time. Pools will have their scheduling algorithm set through the pools config file, and it
will be changeable at runtime.
> To support this feature, I'm also changing the internal logic of the fair scheduler to
no longer use deficits. Instead, for fair sharing, we will assign tasks to the job farthest
below its share as a ratio of its share. This is easier to combine with other scheduling algorithms
and leads to a more stable sharing situation, avoiding unfairness issues brought up in MAPREDUCE-543
and MAPREDUCE-544 that happen when some jobs have long tasks. The new preemption (MAPREDUCE-551)
will ensure that critical jobs can gain their fair share within a bounded amount of time.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message