hadoop-mapreduce-user mailing list archives

From Mapred Learn <mapred.le...@gmail.com>
Subject Re: how to implement error thresholds in a map-reduce job ?
Date Tue, 15 Nov 2011 19:46:52 GMT
Hi Harsh,

My situation is that I need to kill the job when this threshold is reached
across the whole job. Say the threshold is 10 and two mappers combined have
reached that value; how should I achieve this?

With what you are saying, I think the job will only fail once a single
mapper reaches the threshold on its own.
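
In other words, I am after something like the sketch below (the counter
name and poll interval here are just placeholders, not from this thread):
every mapper increments a shared job counter, and the driver polls the
aggregated value and kills the job once the combined count crosses the
threshold.

import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;

public class ThresholdDriver {

  // Shared counter that each mapper increments on a bad record via
  // context.getCounter(Errors.BAD_RECORDS).increment(1).
  public enum Errors { BAD_RECORDS }

  public static void runWithThreshold(Job job, long threshold)
      throws Exception {
    job.submit();
    while (!job.isComplete()) {
      Counter errors = job.getCounters().findCounter(Errors.BAD_RECORDS);
      // Counters from running tasks are aggregated periodically, so the
      // kill may lag the true count slightly.
      if (errors != null && errors.getValue() >= threshold) {
        job.killJob();  // combined count across all mappers hit the limit
        break;
      }
      Thread.sleep(5000);  // poll interval is arbitrary
    }
  }
}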

Thanks,


On Tue, Nov 15, 2011 at 11:22 AM, Harsh J <harsh@cloudera.com> wrote:

> Mapred,
>
> If you fail a task permanently upon encountering a bad situation, you
> basically end up failing the job as well, automatically. By lowering the
> number of retries (say down to 1 or 2 from the default of 4 total
> attempts), you can also make the job fail faster.
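>
> For instance, a one-line sketch ("mapred.map.max.attempts" is the pre-2.x
> property name; it may differ in your version):
>
> job.getConfiguration().setInt("mapred.map.max.attempts", 2); // 2 total attempts instead of the default 4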
>
> Is killing the job immediately a necessity? Why?
>
> I s'pose you could call kill from within the mapper, but I've never seen
> that as necessary in any situation so far. What's wrong with letting the job
> auto-die as a result of a failing task?
>
>  On 16-Nov-2011, at 12:38 AM, Mapred Learn wrote:
>
>  Thanks David for the step-by-step response, but this makes the error
> threshold a per-mapper threshold. Is there a way to make it per job, so
> that all mappers share this value and increment it as a shared counter?
>
>
> On Tue, Nov 15, 2011 at 8:12 AM, David Rosenstrauch <darose@darose.net> wrote:
>
>>  On 11/14/2011 06:06 PM, Mapred Learn wrote:
>>
>>> Hi,
>>>
>>> I have a use case where I want to pass a threshold value to a map-reduce
>>> job, e.g. error records = 10.
>>>
>>> I want the map-reduce job to fail once the total count of error records
>>> in the job, i.e. across all mappers, reaches that value.
>>>
>>> How can I implement this, considering that each mapper will be processing
>>> some part of the input data?
>>>
>>> Thanks,
>>> -JJ
>>>
>>
>> 1) Pass in the threshold value as a configuration value of the M/R job.
>> (i.e., job.getConfiguration().setInt("error_threshold", 10) )
>>
>> 2) Make your mappers implement the Configurable interface.  This will
>> ensure that every mapper gets passed a copy of the config object.
>>
>> 3) When you implement the setConf() method in your mapper (which
>> Configurable will force you to do), retrieve the threshold value from the
>> config and save it in an instance variable in the mapper.  (i.e., int
>> errorThreshold = conf.getInt("error_threshold", 10) -- note that
>> Configuration.getInt() takes a default value as its second argument)
>>
>> 4) In the mapper, when an error record occurs, increment a counter and
>> then check whether the count has reached the threshold.  If so, throw an
>> exception.  (e.g., if (++numErrors >= errorThreshold) throw new
>> RuntimeException("Too many errors") )
>>
>> The exception will kill the mapper.  Hadoop will attempt to re-run it,
>> but subsequent attempts will also fail for the same reason, and
>> eventually the entire job will fail.  (A sketch putting these steps
>> together follows below.)
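>>
>> A minimal sketch of steps 1-4 combined, assuming the
>> org.apache.hadoop.mapreduce API; the key/value types and the
>> isErrorRecord() check are placeholders:
>>
>> import java.io.IOException;
>> import org.apache.hadoop.conf.Configurable;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.io.LongWritable;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.mapreduce.Mapper;
>>
>> public class ThresholdMapper
>>     extends Mapper<LongWritable, Text, Text, Text>
>>     implements Configurable {
>>
>>   private Configuration conf;
>>   private int errorThreshold;
>>   private int numErrors = 0;
>>
>>   @Override
>>   public void setConf(Configuration conf) {
>>     this.conf = conf;
>>     // Step 3: read the value the driver set in step 1.
>>     this.errorThreshold = conf.getInt("error_threshold", 10);
>>   }
>>
>>   @Override
>>   public Configuration getConf() {
>>     return conf;
>>   }
>>
>>   @Override
>>   protected void map(LongWritable key, Text value, Context context)
>>       throws IOException, InterruptedException {
>>     if (isErrorRecord(value)) {
>>       // Step 4: fail the task once this mapper's error count
>>       // reaches the threshold.
>>       if (++numErrors >= errorThreshold) {
>>         throw new RuntimeException("Too many errors: " + numErrors);
>>       }
>>       return;
>>     }
>>     context.write(value, value);  // placeholder pass-through
>>   }
>>
>>   // Placeholder validation; substitute your real check.
>>   private boolean isErrorRecord(Text value) {
>>     return value.toString().startsWith("ERR");
>>   }
>> }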
>>
>> HTH,
>>
>> DR
>>
>
>
>
