hadoop-yarn-issues mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
Date Tue, 26 Apr 2016 22:59:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259131#comment-15259131
] 

Daniel Templeton commented on YARN-4308:
----------------------------------------

I think it would make sense to test that the negative values are properly ignored.

I saw that [~kasha] said the pathological case of always getting a negative value should not
occur, but I'm still a little concerned about that case.  If it happens, there will be no
externally visible signs as to why the reports are being skipped.  Taking the daemon down
to turn on debugging may well change the state, leaving a confused end user.  Is there a way
that we can drop an obvious flag in the logs if the issue persists?  Like maybe if we skip
_n_ reports in a row, log a warning?
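
A minimal sketch of that idea in Java, under stated assumptions: SkippedReportTracker,
record, and the threshold parameter are hypothetical names for illustration, not part of
the patch or of ContainersMonitorImpl. The caller passes in each CPU reading and logs a
warning whenever record returns true.

```java
/**
 * Hypothetical helper (not from the YARN-4308 patch): counts consecutive
 * negative CPU readings and signals when a warning should be logged.
 */
class SkippedReportTracker {
  private final int warnThreshold;
  private int consecutiveSkips = 0;

  SkippedReportTracker(int warnThreshold) {
    if (warnThreshold <= 0) {
      throw new IllegalArgumentException("warnThreshold must be positive");
    }
    this.warnThreshold = warnThreshold;
  }

  /**
   * Records one reading. A negative value counts as a skipped report.
   *
   * @return true each time the run of skipped reports reaches a multiple
   *         of the threshold, so the caller can log a warning then
   */
  boolean record(float cpuUsagePercent) {
    if (cpuUsagePercent < 0) {
      consecutiveSkips++;
      return consecutiveSkips % warnThreshold == 0;
    }
    // A valid reading ends the run of skipped reports.
    consecutiveSkips = 0;
    return false;
  }

  int getConsecutiveSkips() {
    return consecutiveSkips;
  }
}
```

Warning on every multiple of the threshold, rather than only once, keeps the flag visible
in the logs if the condition persists without writing a line on every heartbeat.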

> ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-4308
>                 URL: https://issues.apache.org/jira/browse/YARN-4308
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-4308.patch, 0002-YARN-4308.patch
>
>
> NodeManager reports ContainerAggregated CPU resource utilization as a negative value in the first few heartbeat cycles. I added a new debug print and received the values below from heartbeats.
> {noformat}
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : -1.0
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : 198.94598
> {noformat}
> It's better to send 0 as the CPU usage rather than negative values in heartbeats, even though this happens only in the first few heartbeats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
