hadoop-yarn-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-1073) NM to recognise when it can't spawn process and stop accepting containers
Date Fri, 16 Aug 2013 20:18:48 GMT
Steve Loughran created YARN-1073:
------------------------------------

             Summary: NM to recognise when it can't spawn process and stop accepting containers
                 Key: YARN-1073
                 URL: https://issues.apache.org/jira/browse/YARN-1073
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
    Affects Versions: 2.1.0-beta
         Environment: OS/X with not enough file handles
            Reporter: Steve Loughran


When creating too many containers with a claimed resource use of 0 RAM or vCores, the NM got
to the state where exec() was continually failing, but nothing seemed to recognise this and
blacklist the node.

Something should notice that all container launches for an app/container are failing and
act on it. While AMs can/should code for this, NM failure is something to handle at the YARN level.
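
One way to picture the requested behaviour is a counter of consecutive launch failures that
flips the node to "not accepting" once a threshold is reached. The sketch below is only an
illustration under that assumption: the class LaunchFailureTracker, its methods and the
threshold are hypothetical names, not part of the NodeManager code or any YARN API.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper, not part of YARN: tracks consecutive container
 * launch failures and reports whether new containers should be accepted.
 */
final class LaunchFailureTracker {
    // Assumed policy: this many failures in a row means launches are broken.
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    LaunchFailureTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** A successful launch shows exec() works again, so reset the count. */
    void recordSuccess() {
        consecutiveFailures.set(0);
    }

    /** Call when launching a container process fails (e.g. exec() fails). */
    void recordFailure() {
        consecutiveFailures.incrementAndGet();
    }

    /** False once the number of consecutive failures reaches the threshold. */
    boolean isAcceptingContainers() {
        return consecutiveFailures.get() < threshold;
    }
}
```

In this sketch a single successful launch resets the count, so a node that recovers (for
example once file handles are freed) would start accepting containers again. How the result
would be reported to the RM is left open here.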

