hadoop-mapreduce-issues mailing list archives

From "Neil Jonkers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt
Date Sat, 26 Sep 2015 21:56:04 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909499#comment-14909499 ]

Neil Jonkers commented on MAPREDUCE-6066:
-----------------------------------------

We also see this with Hadoop 2.6.0 using the Capacity Scheduler.
From the job conf:
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

Relevant section from the AM logs:

2015-09-10 05:02:23,102 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1441860050089_0008_m_000450_0] using containerId: [container_1441860050089_0008_01_000452 on NM: [ip-172-31-8-34.ec2.internal:8041]
2015-09-10 05:02:23,103 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1441860050089_0008_m_000450_0 TaskAttempt Transitioned from ASSIGNED to RUNNING

2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1441860050089_0008_m_000450_1] using containerId: [container_1441860050089_0008_01_000558 on NM: [ip-172-31-8-34.ec2.internal:8041]
2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1441860050089_0008_m_000450_1 TaskAttempt Transitioned from ASSIGNED to RUNNING
2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1441860050089_0008_m_000450
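
The same-node placement can be checked mechanically across a whole AM log. The following is a minimal Python sketch, not part of Hadoop: the function names attempts_by_task and colocated_attempts are hypothetical, and the regular expression assumes the TaskAttemptImpl message format exactly as it appears in the excerpt above, including the unclosed bracket after the container id.

```python
import re
from collections import defaultdict

# Assumed format, taken from the log excerpt above. The bracket after the
# container id is left unclosed in the log, so the pattern does not expect "]".
ASSIGN_RE = re.compile(
    r"TaskAttempt:\s*\[(attempt_\d+_\d+_[mr]_\d+_\d+)\]\s+using containerId:\s*"
    r"\[(container_\w+)\s+on NM:\s*\[([^\]\s]+)\]"
)


def attempts_by_task(log_text):
    """Map task id -> list of (attempt id, NodeManager address)."""
    # Collapse whitespace so entries wrapped across several lines still match.
    flat = " ".join(log_text.split())
    tasks = defaultdict(list)
    for attempt, _container, node in ASSIGN_RE.findall(flat):
        # attempt_<cluster>_<job>_<m|r>_<task>_<n> -> task_<cluster>_<job>_<m|r>_<task>
        task = "task_" + attempt[len("attempt_"):].rsplit("_", 1)[0]
        tasks[task].append((attempt, node))
    return dict(tasks)


def colocated_attempts(log_text):
    """Return tasks with more than one attempt assigned to the same NM."""
    result = {}
    for task, attempts in attempts_by_task(log_text).items():
        by_node = defaultdict(list)
        for attempt, node in attempts:
            by_node[node].append(attempt)
        shared = {node: ids for node, ids in by_node.items() if len(ids) > 1}
        if shared:
            result[task] = shared
    return result


# The two assignment entries from the excerpt above, wrapped as in the mail.
SAMPLE = """
2015-09-10 05:02:23,102 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
TaskAttempt: [attempt_1441860050089_0008_m_000450_0] using containerId: [container_1441860050089_0008_01_000452
on NM: [ip-172-31-8-34.ec2.internal:8041]
2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
TaskAttempt: [attempt_1441860050089_0008_m_000450_1] using containerId: [container_1441860050089_0008_01_000558
on NM: [ip-172-31-8-34.ec2.internal:8041]
"""

if __name__ == "__main__":
    for task, nodes in colocated_attempts(SAMPLE).items():
        for node, ids in nodes.items():
            print(task, node, ", ".join(ids))
```

Run against the excerpt, this reports task_1441860050089_0008_m_000450 with attempts _0 and _1 both assigned to ip-172-31-8-34.ec2.internal:8041. The script only recognises the assignment message, so it cannot tell which attempt was the speculative one; that still has to be read from the DefaultSpeculator lines.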


> Speculative attempts should not run on the same node as their original attempt
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6066
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, scheduler
>    Affects Versions: 2.5.0
>            Reporter: Todd Lipcon
>         Attachments: conf.xml
>
>
> I'm seeing a behavior on trunk with fair scheduler enabled where a speculative reduce
> attempt is getting run on the same node as its original attempt. This doesn't make sense --
> the main reason for speculative execution is to deal with a slow node, so scheduling a second
> attempt on the same node would just make the problem worse if anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
