hadoop-user mailing list archives

From Randy Fox <r...@connexity.com>
Subject Preemption and the rerunning of mappers
Date Wed, 03 Feb 2016 23:42:34 GMT
We have a cluster that runs both quick and long-running jobs.  The long-running jobs have
mappers and reducers that each take 1+ hours, and they can take all the resources on
our cluster.

We want a configuration where the quick jobs get resources and are not blocked by the
long-running jobs.  We are using the Fair Scheduler with the long jobs in one queue and the
short jobs in another queue.

Preemption with minResources seems to be the only way to get the quick jobs cluster time.
But not only does it preempt tasks that have been running for a while (gripe: the preemptor
does not seem to kill the jobs that have been running the shortest time), but if it preempts
a reducer, the mappers need to be rerun because the data is lost, and the reducers wait,
wasting resources and causing more preemption.  This creates a nasty cycle and the job never
finishes.
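
A setup like the one described above corresponds roughly to the allocation file sketched
below.  The queue names ("long", "quick") and every number are hypothetical placeholders,
not values from a real cluster.  It also assumes preemption is switched on in yarn-site.xml
via yarn.scheduler.fair.preemption=true.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (allocation file). Names and values are illustrative only. -->
<allocations>
  <!-- Queue for the 1+ hour jobs; it may grow to fill the cluster when idle. -->
  <queue name="long">
    <weight>1.0</weight>
  </queue>

  <!-- Queue for the quick jobs, with a guaranteed minimum share. -->
  <queue name="quick">
    <weight>1.0</weight>
    <!-- Minimum resources this queue is entitled to. -->
    <minResources>20000 mb, 10 vcores</minResources>
    <!-- Seconds the queue may stay below minResources before the
         scheduler preempts containers from other queues. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </queue>
</allocations>
```

A longer minSharePreemptionTimeout makes preemption less frequent, at the cost of the quick
jobs waiting longer for their first containers.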

I am looking for suggestions on how to make this work.  Is there a way to have the mappers
not delete the intermediate data until the job finishes?  Is there a way to have the preemptor
kill the shortest-running jobs?  Or am I approaching this entirely wrong?

Thanks in advance for any and all advice.

Cheers,

Randy

PS: Getting a larger cluster is not an option.