hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-8671) Container Launch failed stating "TaskAttempt killed because it ran on unusable node , Container released on a *lost* node"
Date Tue, 21 Aug 2018 08:07:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated YARN-8671:
-----------------------------
    Labels:   (was: opp)

> Container Launch failed stating "TaskAttempt killed because it ran on unusable node , Container released on a *lost* node"
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8671
>                 URL: https://issues.apache.org/jira/browse/YARN-8671
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.1.1
>            Reporter: Akshay Agarwal
>            Priority: Major
>
> Pre-requisites:
> {code:java}
> 1. Install an HA cluster.
> 2. Set yarn.nodemanager.opportunistic-containers-max-queue-length=(positive integer value) [NodeManager -> yarn-site.xml]
> 3. Set yarn.resourcemanager.opportunistic-container-allocation.enabled=true [ResourceManager -> yarn-site.xml]
> {code}
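>  
> A minimal sketch of how pre-requisites 2 and 3 might look as yarn-site.xml entries. The property names are the ones listed in the pre-requisites; the queue length of 10 is an arbitrary placeholder (the report only asks for a positive integer), and the split between NodeManager and ResourceManager files follows the bracketed notes in the pre-requisites.
> {code:xml}
> <!-- NodeManager yarn-site.xml; 10 is an assumed example value, not taken from the report -->
> <property>
>   <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
>   <value>10</value>
> </property>
> <!-- ResourceManager yarn-site.xml -->
> <property>
>   <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
>   <value>true</value>
> </property>
> {code}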
>  
> Steps to reproduce:
> {code:java}
> 1. Keep all NodeManagers up.
> 2. Stop 2 NodeManagers and immediately follow step 3.
> 3. Submit a job with -Dmapreduce.job.num-opportunistic-maps-percent="100" and run it with 50 mappers.
> {code}
> Expected Result:
> {code:java}
> The job should succeed.
> {code}
> Actual Result:
> {code:java}
> The job succeeds, but some containers fail, stating:
> "TaskAttempt killed because it ran on unusable node , Container released on a *lost* node"
> {code}
>  
> Log Details:
> {code:java}
> TaskAttempt killed because it ran on unusable node Container released on a *lost* node
> Container launch failed for container_1534149133116_0019_01_000006 : java.net.ConnectException: Call From hostname/IP to hostname:portNumber failed on connection exception: java.net.ConnectException:
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

