hadoop-common-dev mailing list archives

From Arkady Borkovsky <ark...@yahoo-inc.com>
Subject Re: [jira] Commented: (HADOOP-1304) MAX_TASK_FAILURES should be configurable
Date Mon, 30 Apr 2007 23:25:45 GMT
Here is a somewhat different but related issue:

It would be useful to make the framework distinguish between
deterministic and non-deterministic failures and react differently to
them.

E.g.
-- in streaming, a Perl script has a syntax error.  There is no need
to check for this 4*300 times.
-- the same exception (with the same stack) is thrown while
processing the same record.  (Google's MapReduce is supposedly able to
skip the offending record at the next attempt, but short of that, why
keep trying?)

(Of course, this is just an optimization, while 1304 is
functionality one cannot do without.)
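
To make the idea concrete, here is a rough sketch of how repeated
failures might be recognized. It is only an illustration under stated
assumptions, not anything in the Hadoop code base: the class
FailureClassifier, its repeatThreshold parameter, and the choice of
fingerprint (exception class, record index, stack trace) are all
hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: counts identical failures so a
// scheduler could stop retrying once a failure looks deterministic.
class FailureClassifier {
    // Assumed knob: how many identical failures count as "deterministic".
    private final int repeatThreshold;
    // Number of times each failure fingerprint has been observed.
    private final Map<String, Integer> seen = new HashMap<>();

    FailureClassifier(int repeatThreshold) {
        this.repeatThreshold = repeatThreshold;
    }

    // Two failures match when the exception class, the record being
    // processed, and the full stack trace are all the same.
    static String fingerprint(Throwable t, long recordIndex) {
        return t.getClass().getName() + "@" + recordIndex + ":"
                + Arrays.toString(t.getStackTrace());
    }

    // Records one failed attempt. Returns true once the same fingerprint
    // has been seen repeatThreshold times, i.e. retrying looks pointless.
    boolean recordFailure(Throwable t, long recordIndex) {
        int count = seen.merge(fingerprint(t, recordIndex), 1, Integer::sum);
        return count >= repeatThreshold;
    }
}
```

With a threshold of 2, the second attempt that dies with the same
exception on the same record would return true, and the framework could
fail the task right away instead of using up the remaining attempts.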

-- ab

On Apr 30, 2007, at 12:34 PM, Arun C Murthy (JIRA) wrote:

>
>     [ https://issues.apache.org/jira/browse/HADOOP-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492766 ]
>
> Arun C Murthy commented on HADOOP-1304:
> ---------------------------------------
>
> One concern with this 'feature' is that we want a reasonable cap on  
> what the user can set max attempts to, else we could have a  
> situation where a user unknowingly, not maliciously, sets it to a very
> large value - thus the framework is now vulnerable to one wrongly
> configured job hogging the cluster...
>
> Also, as per a discussion with Doug we could follow lucene's  
> convention of classifying this knob as 'Expert' so as to clearly  
> elucidate its importance...
>
>> MAX_TASK_FAILURES should be configurable
>> ----------------------------------------
>>
>>                 Key: HADOOP-1304
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-1304
>>             Project: Hadoop
>>          Issue Type: Improvement
>>          Components: mapred
>>    Affects Versions: 0.12.3
>>            Reporter: Christian Kunz
>>         Assigned To: Devaraj Das
>>         Attachments: 1304.patch, 1304.patch
>>
>>
>> After a couple of weeks of failed attempts I was able to finish a  
>> large job only after I changed MAX_TASK_FAILURES to a higher  
>> value. In light of HADOOP-1144 (allowing a certain amount of task  
>> failures without failing the job) it would be even better if this  
>> value could be configured separately for mappers and reducers,  
>> because often a success of a job requires the success of all  
>> reducers but not of all mappers.
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
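
On the quoted points about a cap and about separate limits for mappers
and reducers, a minimal sketch of what a capped, per-task-type setting
could look like follows. Everything named here is an assumption made
for illustration: the property keys, the default of 4, and the hard cap
of 100 are hypothetical, and plain java.util.Properties stands in for
the job configuration.

```java
import java.util.Properties;

// Hypothetical sketch: reads a per-task-type attempt limit and clamps it.
class TaskAttemptLimits {
    static final int DEFAULT_MAX_ATTEMPTS = 4;  // assumed default
    static final int HARD_CAP = 100;            // assumed framework-wide cap

    // Returns the configured limit for the given key, clamped to
    // [1, HARD_CAP]. Missing or unparsable values fall back to the default.
    static int maxAttempts(Properties conf, String key) {
        String raw = conf.getProperty(key);
        if (raw == null) {
            return DEFAULT_MAX_ATTEMPTS;
        }
        int requested;
        try {
            requested = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_ATTEMPTS;
        }
        return Math.max(1, Math.min(requested, HARD_CAP));
    }
}
```

A job could then set, say, example.map.max.attempts and
example.reduce.max.attempts independently, and an accidentally huge
value would be cut down to the cap rather than hogging the cluster.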

