hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Best way to terminate Job early
Date Tue, 02 Nov 2010 14:17:53 GMT
Hello,

On Tue, Nov 2, 2010 at 5:55 PM, Henning Blohm <henning.blohm@zfabrik.de> wrote:
> Hi,
>
>   I have a map/reduce job that may discover along the way, during the
> map phase, that continuing is pointless. I am not sure how to accomplish
> a job cancellation from within the Job. Is there an API for that?

Nothing within the tasks themselves can kill a job (or at least I don't
know of a way), but a "Job" can be killed by its driver.

This is the API you need to look at:
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/RunningJob.html
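Roughly, the driver side could look like the sketch below. Only a sketch:
the class name and the shouldAbort() helper are made up and you'd write
that test yourself (counters, a flag file, whatever your job uses), but
the JobClient/RunningJob calls are from the 0.20 API linked above.

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class DriverKillSketch {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(DriverKillSketch.class);
    // ... set mapper, reducer, input/output paths here ...

    // submitJob() returns immediately, unlike JobClient.runJob(),
    // so the driver keeps control and can watch the job.
    JobClient client = new JobClient(conf);
    RunningJob running = client.submitJob(conf);

    while (!running.isComplete()) {
      if (shouldAbort(running)) { // your own test, e.g. on counters
        running.killJob();        // kills every task of this job
        break;
      }
      Thread.sleep(5000);         // poll at a sane interval
    }
  }

  // Hypothetical helper: inspect the job's state and decide.
  private static boolean shouldAbort(RunningJob running)
      throws java.io.IOException {
    return false; // placeholder for your own condition
  }
}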

>
>   A second-best solution would be a way to update the job's
> configuration so that at least the reducer and all later-started
> mappers/combiners would find the updated configuration indicating that
> continuing is pointless (but some cleanup may still be done by the
> reducer, for example).

If, say, a map task has failed (because of the mapper or combiner), that
failure can be detected via the RunningJob instance itself and you can
decide on an action. You can kill the job at that point (see the polling
sketch further down).

>
>   Anyway, to cut things short:
>
> a) Is there a (good) way to cancel an ongoing job from within the
> Job's implementation?

RunningJob.killJob()

> b) Is there a (good) way to update a job's configuration from within
> the Job's implementation so that the update will be seen by later
> tasks?

Even if you could set a configuration value mid-job (somehow; I am not
sure that's possible), your reducers would still run. You'll lose more
time that way than by killing the job based on the TaskCompletionEvents
a RunningJob returns when queried (which you can keep doing, over the
ClientProtocol).

>
> Thanks,
>   Henning



-- 
Harsh J
www.harshj.com
