hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5448) Resource in Cluster Metrics is not sum of resources in all nodes of all partitions
Date Fri, 29 Jul 2016 18:03:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399773#comment-15399773
] 

Wangda Tan commented on YARN-5448:
----------------------------------

My 2 cents on whether to show "non-usable resource" on the cluster metrics:

Assuming an exclusive partition X isn't assigned to any queue, we definitely need to show
partition X in the queue hierarchy, with warnings, on the *scheduler page* so that the user
can tell it could be a configuration issue. I think we all agree on this.

I also think we need to show a "non-usable resource" on *cluster metrics*. [~sunilg]'s comment
makes sense to me:
bq. I have X resource in my cluster and cluster resource of web UI is displaying the same
too. But resource allocation is only some % of X. (not for full cluster).

Sometimes an admin finds that a cluster has some available resource, but nobody can use it.
If the user happens to have a good understanding of YARN, he/she may check the scheduler page
and see that there's a partition that isn't assigned to any queue.

However, in many cases, the admin reports this as a potential bug to their vendor or to the
mailing list without any investigation.

I think showing both resources (the sum of all active NMs and the non-usable resources) should
make it easier for the admin to figure out what happened.
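
To make the two numbers concrete, here is a rough sketch of the arithmetic only. It is not YARN code: the class name, the method names, the partition-to-memory map, and the sample values are all hypothetical, and it assumes "non-usable" simply means memory on nodes whose partition is not accessible from any queue.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: none of these names are real YARN classes or APIs. */
public class ClusterResourceSketch {

    /** Memory summed over all active NMs, regardless of partition-to-queue mapping. */
    public static long totalMemoryMb(Map<String, Long> memoryMbByPartition) {
        long total = 0;
        for (long mb : memoryMbByPartition.values()) {
            total += mb;
        }
        return total;
    }

    /** Memory on partitions that no queue can access (assumed meaning of "non-usable"). */
    public static long nonUsableMemoryMb(Map<String, Long> memoryMbByPartition,
                                         Set<String> partitionsWithQueues) {
        long nonUsable = 0;
        for (Map.Entry<String, Long> e : memoryMbByPartition.entrySet()) {
            if (!partitionsWithQueues.contains(e.getKey())) {
                nonUsable += e.getValue();
            }
        }
        return nonUsable;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: the default partition (written "" here) plus exclusive partition X.
        Map<String, Long> byPartition = new LinkedHashMap<>();
        byPartition.put("", 16384L);
        byPartition.put("X", 8192L);

        // Only the default partition is assigned to a queue; X is assigned to none.
        Set<String> partitionsWithQueues = Collections.singleton("");

        long total = totalMemoryMb(byPartition);
        long nonUsable = nonUsableMemoryMb(byPartition, partitionsWithQueues);

        System.out.println("Total (all active NMs): " + total + " MB");
        System.out.println("Non-usable:             " + nonUsable + " MB");
        System.out.println("Usable by some queue:   " + (total - nonUsable) + " MB");
    }
}
```

With these made-up values, an admin would see 24576 MB in total, of which 8192 MB is non-usable, which points directly at partition X rather than at a suspected bug.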

Thoughts? 



> Resource in Cluster Metrics is not sum of resources in all nodes of all partitions
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-5448
>                 URL: https://issues.apache.org/jira/browse/YARN-5448
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, resourcemanager, webapp
>    Affects Versions: 2.7.2
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: NodesPage.png, schedulerPage.png
>
>
> Currently the Resource info in Cluster Metrics is obtained from Queue Metrics's *available
resource + allocated resource*. Hence, if some nodes belong to a partition that is not
associated with any queue, the capacity scheduler partition hierarchy shows these nodes'
resources under their partition, but Cluster Metrics doesn't show them.

> Apart from this, the Metrics overview table is also shown on the Nodes page. So if we show
Resource info from Queue Metrics, the user will not be able to correlate it. (I have attached
images for the same.)
> IIUC, the idea of not showing it in the *Metrics overview table* is to highlight that the
configuration is not proper. This needs to be conveyed somehow through the
partition-by-queue-hierarchy chart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

