hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
Date Mon, 14 Mar 2016 17:58:33 GMT

    [ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193764#comment-15193764 ]

Sangjin Lee commented on YARN-4712:

Also some quick comments on the latest patch:

- l.469-473: We need to note other usages of {{cpuUsageTotalCoresPercentage}}. It is used
in tracking the container resource utilization, as well as passed to {{ContainerMetrics.forContainer()}}.
If we're no longer going to use this for the {{NMTimelinePublisher}}, we might need to it

- l.117: We should rename the argument from {{cpuUsageTotalCoresPercentage}} to {{cpuUsagePercentPerCore}}.

> CPU Usage Metric is not captured properly in YARN-2928
> ------------------------------------------------------
>                 Key: YARN-4712
>                 URL: https://issues.apache.org/jira/browse/YARN-4712
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4712-YARN-2928.v1.001.patch, YARN-4712-YARN-2928.v1.002.patch,
>                      YARN-4712-YARN-2928.v1.003.patch, YARN-4712-YARN-2928.v1.004.patch,
>                      YARN-4712-YARN-2928.v1.005.patch
> There are 2 issues with CPU usage collection:
> * I observed that the CPU usage returned by {{pTree.getCpuUsagePercent()}} is often
> ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor still does the calculation,
> i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors()}}.
> Because of this, the UNAVAILABLE check in {{NMTimelinePublisher.reportContainerResourceUsage}}
> is never hit, so proper checks need to be added.
> * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but ContainersMonitor publishes
> decimal values for the CPU usage.
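
To illustrate the two issues described above, here is a minimal, self-contained Java sketch. It is not the patch and does not use the real YARN classes: {{CpuUsageSketch}}, {{totalCoresPercentage}} and {{toLongMetric}} are hypothetical names, and the {{UNAVAILABLE}} constant only mirrors the -1 value quoted in the description. Rounding to a long is one possible way to handle a long-only converter, assumed here for illustration.

```java
// Hypothetical sketch; none of these names come from the YARN code base.
final class CpuUsageSketch {
    // Mirrors the -1 value the issue gives for ResourceCalculatorProcessTree.UNAVAILABLE.
    static final int UNAVAILABLE = -1;

    /** Divides per-core usage by the processor count, passing UNAVAILABLE through untouched. */
    static float totalCoresPercentage(float cpuUsagePercentPerCore, int numProcessors) {
        if (cpuUsagePercentPerCore == UNAVAILABLE || numProcessors <= 0) {
            return UNAVAILABLE;
        }
        return cpuUsagePercentPerCore / numProcessors;
    }

    /** Rounds to a long so a long-only metric store never receives a decimal value. */
    static long toLongMetric(float percentage) {
        if (percentage == UNAVAILABLE) {
            return UNAVAILABLE;
        }
        return Math.round((double) percentage);
    }

    public static void main(String[] args) {
        // Without the guard, -1 / 4 yields -0.25, which no longer equals UNAVAILABLE.
        System.out.println(-1f / 4);                         // -0.25
        System.out.println(totalCoresPercentage(-1f, 4));    // -1.0
        System.out.println(totalCoresPercentage(150f, 4));   // 37.5
        System.out.println(toLongMetric(37.5f));             // 38
    }
}
```

The first printed value shows why a downstream comparison against -1 is skipped once the division has already been applied.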

This message was sent by Atlassian JIRA
