hadoop-yarn-issues mailing list archives

From "Yang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-6966) NodeManager metrics may return wrong negative values after restart
Date Tue, 08 Aug 2017 10:23:00 GMT
Yang Wang created YARN-6966:

             Summary: NodeManager metrics may return wrong negative values after restart
                 Key: YARN-6966
                 URL: https://issues.apache.org/jira/browse/YARN-6966
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yang Wang

This is the same problem as YARN-6212. However, I do not think it is a duplicate of YARN-3933.
The primary cause of the negative values is that the metrics are not recovered properly when the NM restarts.
The counters in the metrics also need to be recovered when the NM restarts.
This should be done in ContainerManagerImpl#recoverContainer.
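
The failure mode can be illustrated with a small self-contained model. This is only a sketch, not the actual Hadoop code: ToyNodeMetrics and RecoverySketch are hypothetical names, the total of 3600 vcores is inferred from the reported AvailableVCores, and the split into two containers of 5 and 6 vcores is an assumption chosen to match the reported AllocatedVCores of -11. The recoverMetrics flag stands in for re-registering each recovered container with the metrics during recovery.

```java
// Simplified stand-in for the NM metrics; NOT the real NodeManagerMetrics class.
class ToyNodeMetrics {
    int allocatedContainers;
    int allocatedVCores;
    int availableVCores;

    ToyNodeMetrics(int totalVCores) {
        this.availableVCores = totalVCores;
    }

    // Called when a container is allocated on the node.
    void allocateContainer(int vcores) {
        allocatedContainers++;
        allocatedVCores += vcores;
        availableVCores -= vcores;
    }

    // Called when a container finishes and its resources are released.
    void releaseContainer(int vcores) {
        allocatedContainers--;
        allocatedVCores -= vcores;
        availableVCores += vcores;
    }
}

class RecoverySketch {
    // Simulates: containers start, the NM restarts, then the application stops.
    static ToyNodeMetrics run(int totalVCores, int[] containerVCores, boolean recoverMetrics) {
        ToyNodeMetrics beforeRestart = new ToyNodeMetrics(totalVCores);
        for (int v : containerVCores) {
            beforeRestart.allocateContainer(v);
        }

        // Restart: the in-memory metrics are lost, but the containers keep running.
        ToyNodeMetrics afterRestart = new ToyNodeMetrics(totalVCores);
        if (recoverMetrics) {
            // Assumed fix: account for every recovered container again.
            for (int v : containerVCores) {
                afterRestart.allocateContainer(v);
            }
        }

        // The application stops, so every container is released.
        for (int v : containerVCores) {
            afterRestart.releaseContainer(v);
        }
        return afterRestart;
    }

    public static void main(String[] args) {
        int[] vcores = {5, 6}; // assumed container sizes
        ToyNodeMetrics broken = run(3600, vcores, false);
        ToyNodeMetrics fixed = run(3600, vcores, true);
        System.out.println("without recovery: AllocatedContainers=" + broken.allocatedContainers
                + " AllocatedVCores=" + broken.allocatedVCores
                + " AvailableVCores=" + broken.availableVCores);
        System.out.println("with recovery:    AllocatedContainers=" + fixed.allocatedContainers
                + " AllocatedVCores=" + fixed.allocatedVCores
                + " AvailableVCores=" + fixed.availableVCores);
    }
}
```

Without the recovery step, the releases after the restart are subtracted from counters that started at zero, which is how the allocation counters end up below zero and the available counter ends up above the node total.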

The scenario can be reproduced with the following steps:
# Make sure YarnConfiguration.NM_RECOVERY_ENABLED=true and YarnConfiguration.NM_RECOVERY_SUPERVISED=true are set in the NM
# Submit an application and keep it running
# Restart the NM
# Stop the application
# The NM metrics now show negative values:
name: "Hadoop:service=NodeManager,name=NodeManagerMetrics",
modelerType: "NodeManagerMetrics",
tag.Context: "yarn",
tag.Hostname: "hadoop1111.com",
ContainersLaunched: 0,
ContainersCompleted: 0,
ContainersFailed: 2,
ContainersKilled: 0,
ContainersIniting: 0,
ContainersRunning: 0,
AllocatedGB: 0,
AllocatedContainers: -2,
AvailableGB: 160,
AllocatedVCores: -11,
AvailableVCores: 3611,
ContainerLaunchDurationNumOps: 2,
ContainerLaunchDurationAvgTime: 6,
BadLocalDirs: 0,
BadLogDirs: 0,
GoodLocalDirsDiskUtilizationPerc: 2,
GoodLogDirsDiskUtilizationPerc: 2
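
To spot the affected counters in a dump like the one above, a small scan over the "Key: value," lines is enough. The following is a minimal sketch that assumes the dump is available as plain text in exactly this layout; NegativeMetricScan and findNegative are hypothetical names and not part of Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NegativeMetricScan {
    // Returns every metric whose value parses as a negative integer, in dump order.
    static Map<String, Long> findNegative(String dump) {
        Map<String, Long> negatives = new LinkedHashMap<>();
        for (String line : dump.split("\\R")) {
            int colon = line.lastIndexOf(':');
            if (colon < 0) {
                continue; // not a "Key: value" line
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (value.endsWith(",")) {
                value = value.substring(0, value.length() - 1).trim();
            }
            try {
                long n = Long.parseLong(value);
                if (n < 0) {
                    negatives.put(key, n);
                }
            } catch (NumberFormatException e) {
                // Non-numeric entries such as name or tag.Hostname are skipped.
            }
        }
        return negatives;
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
                "tag.Context: \"yarn\",",
                "AllocatedGB: 0,",
                "AllocatedContainers: -2,",
                "AvailableGB: 160,",
                "AllocatedVCores: -11,",
                "AvailableVCores: 3611,");
        System.out.println(findNegative(dump));
    }
}
```

Note that a check like this only catches the counters that went below zero. AvailableVCores is also wrong in the dump above, but detecting that requires knowing the node's configured total.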
