hadoop-yarn-issues mailing list archives

From "qus-jiawei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1469) ApplicationMaster crash cause the TaskAttemptImpl couldn't handle the TA_TOO_MANY_FETCH_FAILURE at KILLED
Date Tue, 03 Dec 2013 10:20:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837536#comment-13837536 ]

qus-jiawei commented on YARN-1469:
----------------------------------

OK thanks

> ApplicationMaster crash cause the TaskAttemptImpl couldn't handle the TA_TOO_MANY_FETCH_FAILURE at KILLED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1469
>                 URL: https://issues.apache.org/jira/browse/YARN-1469
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: qus-jiawei
>         Attachments: job_1384857622207_222215-amlog.txt
>
>
> This bug can occur when an admin decommissions a NodeManager. The details are below:
> 1. A job is running normally on the YARN cluster. Some MapTasks finish on machine A, and
> the job begins to schedule reduce tasks. At this point the MapTasks' state is succeeded.
> 2. The Hadoop admin decommissions machine A's NodeManager.
> 3. The ApplicationMaster finds that some MapTasks had finished on the decommissioned
> NodeManager and changes the state of those MapTasks to KILLED.
> 4. Some running ReduceTasks cannot fetch the data from those MapTasks and send a
> TA_TOO_MANY_FETCH_FAILURE event to TaskAttemptImpl.
> 5. TaskAttemptImpl cannot handle TA_TOO_MANY_FETCH_FAILURE in the KILLED state, so it
> throws an exception, which causes the ApplicationMaster to go into the ERROR state.
> I think TaskAttemptImpl could simply ignore the TA_TOO_MANY_FETCH_FAILURE event in the
> KILLED state.
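
The sequence in steps 1 to 5 and the proposed fix can be illustrated with a toy state machine. This is only a sketch under stated assumptions: the class TaskAttemptSketch, its reduced State and Event enums, the ignoreLateFetchFailures flag, and the use of IllegalStateException are all hypothetical stand-ins, not the real TaskAttemptImpl or Hadoop's state machine classes. It shows one way "ignore the event at KILLED" might look: a self-loop transition from KILLED to KILLED.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, heavily simplified model of a task attempt's state machine.
// It is NOT the real TaskAttemptImpl; names and transitions are illustrative only.
class TaskAttemptSketch {
    enum State { RUNNING, SUCCEEDED, FAILED, KILLED }
    enum Event { TA_DONE, TA_KILL, TA_TOO_MANY_FETCH_FAILURE }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);
    private State state = State.RUNNING;

    // ignoreLateFetchFailures is an assumed switch that models the proposed fix.
    TaskAttemptSketch(boolean ignoreLateFetchFailures) {
        add(State.RUNNING, Event.TA_DONE, State.SUCCEEDED);
        add(State.RUNNING, Event.TA_KILL, State.KILLED);
        // A map attempt that already succeeded is killed when its node is decommissioned.
        add(State.SUCCEEDED, Event.TA_KILL, State.KILLED);
        add(State.SUCCEEDED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        if (ignoreLateFetchFailures) {
            // Proposed fix: a self-loop, so the late event is accepted and ignored.
            add(State.KILLED, Event.TA_TOO_MANY_FETCH_FAILURE, State.KILLED);
        }
    }

    private void add(State from, Event event, State to) {
        transitions.computeIfAbsent(from, k -> new EnumMap<>(Event.class)).put(event, to);
    }

    State getState() {
        return state;
    }

    // Applies an event; an unregistered (state, event) pair is treated as fatal,
    // standing in for the exception described in step 5.
    void handle(Event event) {
        Map<Event, State> row = transitions.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event " + event + " at " + state);
        }
        state = next;
    }

    public static void main(String[] args) {
        TaskAttemptSketch attempt = new TaskAttemptSketch(true);
        attempt.handle(Event.TA_DONE);                   // step 1: map attempt finishes
        attempt.handle(Event.TA_KILL);                   // steps 2-3: node decommissioned
        attempt.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // step 4: late reducer report
        System.out.println("Final state: " + attempt.getState());
    }
}
```

Constructing the sketch with the flag set to false and replaying the same three events makes the last handle call throw, which corresponds to step 5. Whether a silent self-loop is the right behavior in the real code, as opposed to logging the event or handling it some other way, is a design question this sketch does not settle.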



--
This message was sent by Atlassian JIRA
(v6.1#6144)
