hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4982) AM hung with one pending map task
Date Fri, 08 Feb 2013 09:23:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574346#comment-13574346 ]

Siddharth Seth commented on MAPREDUCE-4982:
-------------------------------------------

Great debugging!

bq. I see some convincing evidence in the AM log that what I suspected is true. There was
one less "Assigned from earlierFailedMaps" entry in the log than there were failed map attempts
that received containers. I see one of them was allocated a normal priority container, although
I'm not sure how from looking at the code.
I think this could happen if there are no node-local or rack-local tasks for a container. The
assignToMap in branch-0.23 then falls back to pulling an attempt from 'maps', which could be a
previously failed attempt.
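
To make the fallback concrete, here is a minimal sketch of the behaviour described above. It is a simplified model and not the actual allocator code in branch-0.23: only the names assignToMap and 'maps' come from the discussion, while the class name, addPending, byHost, byRack and the attempt ids are hypothetical, and the assumption that 'maps' holds every pending attempt (including retries of failed ones) is mine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the fallback discussed above.
// This is NOT the real allocator code from branch-0.23.
class AssignToMapSketch {
    // Assumption: every pending map attempt is tracked here in arrival order.
    // Value = true if an earlier attempt of the same task failed.
    private final Map<String, Boolean> maps = new LinkedHashMap<>();
    // Hypothetical locality indexes, filled only when a host or rack is known.
    private final Map<String, List<String>> byHost = new LinkedHashMap<>();
    private final Map<String, List<String>> byRack = new LinkedHashMap<>();

    void addPending(String attemptId, boolean earlierAttemptFailed,
                    String host, String rack) {
        maps.put(attemptId, earlierAttemptFailed);
        if (host != null) {
            byHost.computeIfAbsent(host, k -> new ArrayList<>()).add(attemptId);
        }
        if (rack != null) {
            byRack.computeIfAbsent(rack, k -> new ArrayList<>()).add(attemptId);
        }
    }

    // Choose an attempt for a normal-priority map container on host/rack.
    // Returns null when nothing is pending.
    String assignToMap(String host, String rack) {
        String id = firstStillPending(byHost.get(host));      // node-local
        if (id == null) {
            id = firstStillPending(byRack.get(rack));         // rack-local
        }
        if (id == null && !maps.isEmpty()) {
            // Fallback: take any pending attempt. Nothing here checks whether
            // the attempt is a retry of a failed one.
            id = maps.keySet().iterator().next();
        }
        if (id != null) {
            maps.remove(id);
        }
        return id;
    }

    // Drop stale entries and return the first candidate that is still pending.
    private String firstStillPending(List<String> candidates) {
        if (candidates == null) {
            return null;
        }
        while (!candidates.isEmpty()) {
            String id = candidates.remove(0);
            if (maps.containsKey(id)) {
                return id;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        AssignToMapSketch s = new AssignToMapSketch();
        s.addPending("retry_of_failed", true, null, null);
        s.addPending("fresh", false, "h1", "r1");
        // The container is on a host and rack with no local work.
        System.out.println(s.assignToMap("h2", "r2"));
    }
}
```

In this model a normal-priority container with no local match ends up with the retried attempt, so that attempt never goes through the earlierFailedMaps path. Under the stated assumptions, that would be consistent with seeing one fewer "Assigned from earlierFailedMaps" log entry than expected.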

In branch-2, it looks like a container meant for a REDUCE could be allocated to a MAP as well.
I'm not sure whether such a scenario would arise, or what problems it could create.
                
> AM hung with one pending map task
> ---------------------------------
>
>                 Key: MAPREDUCE-4982
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4982
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 0.23.6
>            Reporter: Jason Lowe
>
> Saw a job that hung with one pending map task that never ran.  The task was in the SCHEDULED
> state with a single attempt that was in the UNASSIGNED state.  The AM looked like it was waiting
> for a container from the RM, but the RM was never granting it the one container it needed.
> I suspect the AM botched the container request bookkeeping somehow.  More details to
> follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
