hadoop-mapreduce-user mailing list archives

From Friso van Vollenhoven <fvanvollenho...@xebia.com>
Subject Re: Job-wide cleanup functionality?
Date Thu, 03 Feb 2011 18:53:45 GMT
There is this config option:
 <description>Indicates url which will be called on completion of job to inform
              end status of job.
              User can give at most 2 variables with URI: $jobId and $jobStatus.
              If they are present in URI, then they will be replaced by their
              respective values.
 </description>
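
For reference, a sketch of how that option could be set in the job configuration. The property name used here (`job.end.notification.url`, as found in 0.20-era mapred-default.xml) and the URL value are illustrative assumptions; check both against your Hadoop version.

```xml
<!-- Sketch only: property name and URL are illustrative, verify for your
     Hadoop release. $jobId and $jobStatus are substituted by the framework. -->
<property>
  <name>job.end.notification.url</name>
  <value>http://myhost:8080/jobdone?id=$jobId&amp;status=$jobStatus</value>
</property>
```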

That might help you. I have never used it, because it requires running something that listens
on HTTP, which introduces an additional dependency that might fail at some point. As far as
I know there is no API to achieve the same thing (I spent some time looking for one). If you need
to be absolutely sure that your code runs after the job (as long as the job tracker is still
running, of course), then you'd need to create a patch, I guess...
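
Failing that, the usual workaround is to do the job-wide cleanup in the driver, right after waitForCompletion() returns. Below is a minimal plain-Java sketch of that control flow; submitJob() is a stand-in for job.waitForCompletion(true) so the sketch runs without a cluster, and all names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class DriverCleanup {
    static final List<String> log = new ArrayList<>();

    // Stand-in for org.apache.hadoop.mapreduce.Job#waitForCompletion(true);
    // in a real driver this submits the job and blocks until it finishes.
    static boolean submitJob() {
        log.add("job ran");
        return true; // true = job succeeded
    }

    static int runDriver() {
        boolean success = false;
        try {
            success = submitJob();
        } finally {
            // Job-wide cleanup: runs exactly once, after all reducers finish,
            // whether the job succeeded, failed, or the driver threw.
            log.add(success ? "cleanup after success" : "cleanup after failure");
        }
        return success ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println("exit code: " + runDriver());
    }
}
```

The obvious limitation, as noted above, is that the cleanup only happens if the driver process itself stays alive until the job completes.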


On 3 Feb 2011, at 17:39, David Rosenstrauch wrote:

> Perhaps this has been covered before, but I wasn't able to dig up any info.
> Is there any way to run a custom "job cleanup" for a map/reduce job?  I know that each
> map and reduce has a cleanup method, which can be used to clean up at the end of each task.
> But what I want is to run a single cleanup step that runs after all the reducers complete.
> Is there any way to do this in M/R?  I didn't see one.  Would be nice to have an API like
> this to work with:
> job.setCleanupClass(MyCleanupClass.class)
> Also, if no such functionality exists, what's my next best workaround to achieve this
> on a M/R job?
> Thanks,
> DR
