hadoop-common-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-299) maps from second jobs will not run until the first job finishes completely
Date Wed, 14 Jun 2006 07:23:30 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-299?page=comments#action_12416131 ] 

Andrzej Bialecki  commented on HADOOP-299:
------------------------------------------

It's the same issue that I reported earlier on the mailing list (e.g. http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01510.html,
http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01524.html and http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01557.html).

Your patch, although it improves things, could perhaps go one step further if you have some
time to spare... ;) I'm thinking specifically about the following:

* don't schedule reduce tasks from jobs whose map tasks have had no chance to run yet. This
happens when there are a couple of available slots and map tasks cannot be scheduled yet, but
the code in JobTracker:743 still allocates reduce tasks from the next job.

* allow simple priority-based preemption, i.e. jobs with a higher priority (presumably short-lived)
could be favored in task allocation over already running jobs.

* alternatively, allow setting limits on the min/max % of cluster capacity the job is willing
to accept. This gives some leeway to the scheduler to allocate the flexible portion of remaining
tasks to old/new jobs.

* allow setting different priorities for maps and reduces - e.g. in the Nutch fetcher job, map
tasks are very long-running and in many cases need to fit within a specified time frame (e.g.
during the night). However, reduce tasks, which simply reshuffle the data, are not as time-critical.
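
To make the first two suggestions concrete, here is a rough sketch of what such a selection rule might look like. It is not the JobTracker code: JobState, ReducePicker and all field names are made up for illustration, and the real scheduler tracks far more state than this.

```java
import java.util.List;

// Hypothetical snapshot of a job as the scheduler might see it (illustrative only).
final class JobState {
    final String name;
    final int priority;        // assumption: a larger value means more urgent
    final int mapsStarted;     // map tasks of this job that have been launched so far
    final int pendingReduces;  // reduce tasks still waiting for a slot

    JobState(String name, int priority, int mapsStarted, int pendingReduces) {
        this.name = name;
        this.priority = priority;
        this.mapsStarted = mapsStarted;
        this.pendingReduces = pendingReduces;
    }
}

final class ReducePicker {
    /**
     * Picks the job that should receive the next free reduce slot, or null if
     * no job qualifies. Jobs are expected in submission order.
     */
    static JobState pickReduceJob(List<JobState> jobsInSubmitOrder) {
        JobState best = null;
        for (JobState job : jobsInSubmitOrder) {
            if (job.pendingReduces == 0) {
                continue; // nothing to schedule for this job
            }
            if (job.mapsStarted == 0) {
                continue; // suggestion 1: no reduces before any map had a chance to run
            }
            // Suggestion 2: favor higher priority; ties keep submission order.
            if (best == null || job.priority > best.priority) {
                best = job;
            }
        }
        return best;
    }
}
```

With two equal-priority jobs where only the first has started maps, the first job gets the reduce slot; a later high-priority job wins as soon as at least one of its maps has started.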


> maps from second jobs will not run until the first job finishes completely
> --------------------------------------------------------------------------
>
>          Key: HADOOP-299
>          URL: http://issues.apache.org/jira/browse/HADOOP-299
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Versions: 0.3.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.4.0
>  Attachments: map-schedule.patch
>
> Because of the logic in the JobTracker's pollForNewTask, second jobs will rarely start
> running maps until the first job finishes completely. The JobTracker leaves room to re-run
> failed maps from the first job and it reserves the total number of maps for the first job.
> Thus, if you have more maps in the first job than your cluster capacity, none of the second
> job maps will ever run.
> I propose setting the reserve to 1% of the first job's maps.
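
The arithmetic behind the description above can be illustrated with a small sketch. This is a simplification under stated assumptions, not the logic of pollForNewTask: the class and method names are hypothetical, rounding the 1% reserve up is an assumption, and the model only compares the reserve against total cluster capacity.

```java
// Hypothetical illustration of the reserve arithmetic; not the actual JobTracker code.
final class MapReserve {
    /** Behavior described in the issue: reserve every map of the first job. */
    static int fullReserve(int firstJobMaps) {
        return firstJobMaps;
    }

    /** Proposed behavior: reserve 1% of the first job's maps (rounding up is an assumption). */
    static int onePercentReserve(int firstJobMaps) {
        return (firstJobMaps + 99) / 100;
    }

    /** Map slots that remain available to later jobs once the reserve is set aside. */
    static int slotsForLaterJobs(int clusterCapacity, int reserve) {
        return Math.max(0, clusterCapacity - reserve);
    }
}
```

For example, with 1000 maps in the first job and 200 map slots in the cluster, the full reserve leaves no slots for a second job, while the 1% reserve holds back 10 slots and leaves 190.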


