hadoop-mapreduce-issues mailing list archives

From "Sultan Alamro (Jira)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-7245) Reduce phase does not continue processing with failed SCHEDULED Map tasks
Date Thu, 31 Oct 2019 01:25:00 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sultan Alamro updated MAPREDUCE-7245:
-------------------------------------
    Description: 
When we set *mapreduce.map.maxattempts* to 1, the reduce tasks should ignore the output of
failed map tasks, as stated in the EventFetcher class. However, it turns out that this only
happens when a map task transitions from RUNNING to FAILED, not from SCHEDULED to FAILED.
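
To illustrate why a missing completion event matters, here is a minimal sketch of the reduce-side bookkeeping. It is a simplified, hypothetical model: the names ShuffleEventSketch, CompletionEvent and ReducerView are made up for this example and are not Hadoop classes, and the status names are only assumed to resemble the ones Hadoop uses. The reducer counts a map as resolved only when it receives an event for it, so a map that fails without emitting any event is awaited indefinitely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of reduce-side event handling.
// None of these types are Hadoop's own classes.
class ShuffleEventSketch {

    // Assumed status values for a map completion event.
    enum Status { SUCCEEDED, FAILED, KILLED, OBSOLETE, TIPFAILED }

    // One completion event for one map task, as seen by a reducer.
    static final class CompletionEvent {
        final int mapId;
        final Status status;

        CompletionEvent(int mapId, Status status) {
            this.mapId = mapId;
            this.status = status;
        }
    }

    // Tracks which maps the reducer is still waiting for.
    static final class ReducerView {
        private final boolean[] resolved;
        private int remaining;
        private final List<Integer> toFetch = new ArrayList<>();

        ReducerView(int totalMaps) {
            this.resolved = new boolean[totalMaps];
            this.remaining = totalMaps;
        }

        void onEvent(CompletionEvent e) {
            switch (e.status) {
                case SUCCEEDED:
                    // Output exists: queue it for fetching and stop waiting.
                    toFetch.add(e.mapId);
                    resolve(e.mapId);
                    break;
                case TIPFAILED:
                    // The task failed for good: stop waiting, nothing to fetch.
                    resolve(e.mapId);
                    break;
                default:
                    // FAILED, KILLED, OBSOLETE concern a single attempt;
                    // in this model the task itself is still unresolved.
                    break;
            }
        }

        private void resolve(int mapId) {
            if (!resolved[mapId]) {
                resolved[mapId] = true;
                remaining--;
            }
        }

        int remaining() { return remaining; }

        List<Integer> toFetch() { return toFetch; }
    }

    public static void main(String[] args) {
        ReducerView reducer = new ReducerView(2);
        reducer.onEvent(new CompletionEvent(0, Status.SUCCEEDED));
        // Map 1 failed while SCHEDULED. Until an event arrives, it is still awaited.
        System.out.println("remaining before event: " + reducer.remaining());
        reducer.onEvent(new CompletionEvent(1, Status.TIPFAILED));
        System.out.println("remaining after event: " + reducer.remaining());
    }
}
```

In this model the reducer can only make progress past a failed map once some event for that map reaches it, which is the behaviour the report says is missing for SCHEDULED attempts.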

 

I think this problem can be solved in TaskImpl.java by adding an else branch that handles
the case where no container has been assigned.

 

if (attempt.getNodeHttpAddress() != null) {
  // A container was assigned: report where the map output is served.
  TaskAttemptCompletionEvent tce = recordFactory
      .newRecordInstance(TaskAttemptCompletionEvent.class);
  tce.setEventId(-1);
  String scheme = (encryptedShuffle) ? "https://" : "http://";
  tce.setMapOutputServerAddress(StringInterner.weakIntern(scheme
      + attempt.getNodeHttpAddress().split(":")[0] + ":"
      + attempt.getShufflePort()));
  tce.setStatus(status);
  tce.setAttemptId(attempt.getID());
  int runTime = 0;
  if (attempt.getFinishTime() != 0 && attempt.getLaunchTime() != 0) {
    runTime = (int) (attempt.getFinishTime() - attempt.getLaunchTime());
  }
  tce.setAttemptRunTime(runTime);

  // raise the event to job so that it adds the completion event to its
  // data structures
  eventHandler.handle(new JobTaskAttemptCompletedEvent(tce));
} else {
  // No container was assigned (the attempt failed while SCHEDULED): still
  // raise a completion event, without a map output address or run time.
  TaskAttemptCompletionEvent tce = recordFactory
      .newRecordInstance(TaskAttemptCompletionEvent.class);
  tce.setEventId(-1);
  tce.setStatus(status);
  tce.setAttemptId(attempt.getID());
  eventHandler.handle(new JobTaskAttemptCompletedEvent(tce));
}
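
The two branches above share most of their setup. As a possible simplification, the sketch below builds the event unconditionally and fills in the address and run time only when a node address exists. This is a standalone, hypothetical model rather than a patch against TaskImpl: the types Attempt and Event, the method buildEvent, and the plain String status are all invented for this illustration.

```java
// Hypothetical, self-contained model of the proposed behaviour.
// Attempt, Event and buildEvent are illustrative names, not Hadoop APIs.
class CompletionEventSketch {

    // Minimal stand-in for a task attempt.
    static final class Attempt {
        final String id;
        final String nodeHttpAddress; // null when no container was assigned
        final int shufflePort;
        final long launchTime;
        final long finishTime;

        Attempt(String id, String nodeHttpAddress, int shufflePort,
                long launchTime, long finishTime) {
            this.id = id;
            this.nodeHttpAddress = nodeHttpAddress;
            this.shufflePort = shufflePort;
            this.launchTime = launchTime;
            this.finishTime = finishTime;
        }
    }

    // Minimal stand-in for a completion event.
    static final class Event {
        String attemptId;
        String status;
        String mapOutputServerAddress; // stays null without a container
        int runTime;                   // stays 0 without a container
    }

    // Always returns an event; address and run time are set only when
    // the attempt actually ran on a node.
    static Event buildEvent(Attempt attempt, String status, boolean encryptedShuffle) {
        Event event = new Event();
        event.attemptId = attempt.id;
        event.status = status;
        if (attempt.nodeHttpAddress != null) {
            String scheme = encryptedShuffle ? "https://" : "http://";
            event.mapOutputServerAddress = scheme
                    + attempt.nodeHttpAddress.split(":")[0] + ":"
                    + attempt.shufflePort;
            if (attempt.finishTime != 0 && attempt.launchTime != 0) {
                event.runTime = (int) (attempt.finishTime - attempt.launchTime);
            }
        }
        return event;
    }

    public static void main(String[] args) {
        Attempt scheduledOnly = new Attempt("attempt_m_000001_0", null, 0, 0L, 0L);
        Event event = buildEvent(scheduledOnly, "TIPFAILED", false);
        System.out.println(event.attemptId + " " + event.status
                + " address=" + event.mapOutputServerAddress);
    }
}
```

Whether a single code path or an explicit else branch is preferable in the real class is a judgment call; the point here is only that an event is produced in both cases.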

  was: When we set *mapreduce.map.maxattempts* to 1, the reduce tasks should ignore the output
of failed map tasks, as stated in the EventFetcher class. However, it turns out that this only
happens when a map task transitions from RUNNING to FAILED, not from SCHEDULED to FAILED.


> Reduce phase does not continue processing with failed SCHEDULED Map tasks
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7245
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7245
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.7.2, 3.2.1
>            Reporter: Sultan Alamro
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

