hadoop-common-user mailing list archives

From oj...@doc.ic.ac.uk
Subject Re: Error reporting from map function
Date Thu, 02 Aug 2007 12:20:27 GMT
Hi Doug,

Thanks for the reply. Could you explain how my program would get access
to the task reports from each tracker? I've found the getMapTaskReports
method in the JobClient class, but I can't work out how to call it other
than by creating a new instance of JobClient. That JobClient would be a
different instance from the one that was running my job, so would it
see a different set of TaskReports?

Quoting Doug Cutting <cutting@apache.org>:

> ojh06@doc.ic.ac.uk wrote:
>> I've written a map task that will on occasion not compute the   
>> correct result. This can easily be detected, at which point I'd   
>> like the map task to report the error and terminate the entire   
>> map/reduce job. Does anyone know of a way I can do this?
> You can easily kill the job from a map task.  Just use the
> mapred.job.id job property to get the job id, then use JobClient to
> kill the job. Reporting the error could be done by setting the task's
> state in the reporter, and then scanning task reports from your job
> client after the job is killed for such state strings.  Or you could
> perhaps just set a counter on the reporter in the map task, and then
> check that counter on the RunningJob, so that you don't have to scan
> all the tasks.  You might need to sleep a few seconds after setting the
> state or counter before killing the job, so that these reports have a
> chance to make it back to the jobtracker.
> Doug
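
The flow described in the quoted reply can be sketched in plain Java as
follows. This is only an illustration under stated assumptions, not Hadoop
code: SketchReporter, SketchRunningJob, FakeJob and ErrorReportSketch are
hypothetical stand-ins for the reporter, the RunningJob and the jobtracker
state, and the counter name BAD_RESULT and the grace period are made up.
The real method names on Reporter, RunningJob and JobClient should be
checked against the Hadoop API docs. The sketch assumes that status and
counters are held centrally and looked up by job, which is why the client
side here does not depend on which client object submitted the job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the status/counter part of the reporter
// that a map task receives. Not the Hadoop interface.
interface SketchReporter {
    void setStatus(String status);
    void incrCounter(String name, long amount);
}

// Hypothetical stand-in for the job handle a client obtains from a job id
// (the id a task would read from the mapred.job.id property).
interface SketchRunningJob {
    void killJob();
    long getCounter(String name);
    boolean isKilled();
}

// In-memory fake that plays the role of the jobtracker's view of one job,
// so the sketch can run without a cluster.
class FakeJob implements SketchReporter, SketchRunningJob {
    private final Map<String, Long> counters = new HashMap<>();
    private String lastStatus = "";
    private boolean killed = false;

    public void setStatus(String status) { lastStatus = status; }
    public void incrCounter(String name, long amount) {
        counters.merge(name, amount, Long::sum);
    }
    public void killJob() { killed = true; }
    public long getCounter(String name) { return counters.getOrDefault(name, 0L); }
    public boolean isKilled() { return killed; }
    public String getLastStatus() { return lastStatus; }
}

class ErrorReportSketch {
    // Assumed counter name, chosen only for this illustration.
    static final String BAD_RESULT = "BAD_RESULT";

    // Map side: record the error, wait briefly so the report can
    // propagate, then kill the whole job.
    static void reportAndKill(SketchReporter reporter, SketchRunningJob job,
                              String detail, long graceMillis) {
        reporter.setStatus("ERROR: " + detail);
        reporter.incrCounter(BAD_RESULT, 1);
        try {
            Thread.sleep(graceMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        job.killJob();
    }

    // Client side: after the job ends, one counter lookup tells us whether
    // it was killed because of a bad result, without scanning every task.
    static boolean failedOnBadResult(SketchRunningJob job) {
        return job.isKilled() && job.getCounter(BAD_RESULT) > 0;
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob();
        reportAndKill(job, job, "result check failed", 10);
        System.out.println("bad result reported: " + failedOnBadResult(job));
    }
}
```

The counter variant is used on the client side because, as the reply
notes, it avoids scanning all the task reports for state strings. The
status string is still set so that a per-task scan remains possible.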
