hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4481) negative pending resource of queues lead to applications in accepted status inifnitly
Date Fri, 18 Dec 2015 13:16:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063935#comment-15063935 ]

Sunil G commented on YARN-4481:
-------------------------------

Hi [~gu chi],
Could you please share the debug logs as well? We have seen this problem in a few cases while
using DRC. It would be really great if you could share the RM logs and AM logs.

> negative pending resource of queues lead to applications in accepted status inifnitly
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-4481
>                 URL: https://issues.apache.org/jira/browse/YARN-4481
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.7.2
>            Reporter: gu-chi
>            Priority: Critical
>         Attachments: jmx.txt
>
>
> Met a scenario of negative pending resources with the capacity scheduler. In JMX, it shows:
> {noformat}
>     "PendingMB" : -4096,
>     "PendingVCores" : -1,
>     "PendingContainers" : -1,
> {noformat}
> Full JMX information is attached.
> This is not just a JMX UI issue; the actual pending resource of the queue is also negative, as I can see from the debug log:
> bq. DEBUG | ResourceManager Event Processor | Skip this queue=root, because it doesn't need more resource, schedulingMode=RESPECT_PARTITION_EXCLUSIVITY node-partition= | ParentQueue.java
> This leads to the {{NULL_ASSIGNMENT}}.
> The background: hundreds of applications were submitted, consuming all cluster resources, and reservations happened. While they were running, network faults were injected with a tool; the injection types were delay, jitter, repeat, packet loss, and disorder. Most of the submitted applications were then killed.
> Is anyone else facing negative pending resources, or does anyone have an idea of how this happens?
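
For anyone checking their own cluster for the same symptom, a minimal Python sketch that flags negative pending values in a queue metrics bean might look like the following. It assumes the bean has already been fetched and parsed into a dict; the helper name negative_pending is made up for illustration, and only the three field names and their values come from the JMX output quoted above.

```python
import json

# Field names as they appear in the JMX output quoted in the report.
PENDING_KEYS = ("PendingMB", "PendingVCores", "PendingContainers")


def negative_pending(bean):
    """Return the pending metrics in one metrics bean whose value is below zero.

    `bean` is assumed to be a dict parsed from JMX JSON (hypothetical helper,
    not part of Hadoop).
    """
    return {k: bean[k] for k in PENDING_KEYS if k in bean and bean[k] < 0}


# Values copied from the report; a real bean carries many more fields.
sample = json.loads(
    '{"PendingMB": -4096, "PendingVCores": -1, "PendingContainers": -1}'
)
print(negative_pending(sample))
```

A non-empty result only shows that a bean reports the symptom; it says nothing about the cause.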



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
