hadoop-yarn-issues mailing list archives

From "Szilard Nemeth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-8566) Add diagnostic message for unschedulable containers
Date Tue, 24 Jul 2018 07:48:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-8566:
    Attachment: YARN-8566.004.patch

> Add diagnostic message for unschedulable containers
> ---------------------------------------------------
>                 Key: YARN-8566
>                 URL: https://issues.apache.org/jira/browse/YARN-8566
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-8566.001.patch, YARN-8566.002.patch, YARN-8566.003.patch, YARN-8566.004.patch
> If a queue is configured with maxResources set to 0 for a resource, and an application that
> requests that resource is submitted to that queue, the application will remain pending
> until it is removed or moved to a different queue. This behavior can occur without extended
> resources, but it's unlikely that a user will create a queue that allows 0 memory or CPU. As
> the number of resource types in the system increases, this scenario will become more common,
> and it will become harder to recognize these cases. Therefore, the scheduler should indicate in
> the diagnostic string for an application if it was not scheduled because of a maxResources
> value of 0.
> Example configuration (fair-scheduler.xml):
> {code:java}
> <allocations>
>   <queueMaxAppsDefault>100000</queueMaxAppsDefault>
>   <queue name="sample_queue">
>     <minResources>10000 mb,2vcores</minResources>
>     <maxResources>90000 mb,4vcores, 0gpu</maxResources>
>     <maxRunningApps>50</maxRunningApps>
>     <maxAMShare>-1.0f</maxAMShare>
>     <weight>2.0</weight>
>     <schedulingPolicy>fair</schedulingPolicy>
>   </queue>
> </allocations>
> {code}
> Command: 
> {code:java}
> yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000;
> {code}
> The job hangs and the application diagnostic info is empty.
> Given that an exception is thrown before any mapper/reducer container is created, the
> diagnostic message of the AM should be updated.
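
The check requested in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the attached patches: the class name ZeroMaxResourceDiagnostic, the method diagnose, the resource labels, and the message wording are all hypothetical, and resources are modeled as plain maps rather than YARN's own resource types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the YARN code base. */
final class ZeroMaxResourceDiagnostic {

    /**
     * Returns a diagnostic message if the request asks for any resource whose
     * queue maximum is explicitly 0, or an empty string otherwise.
     * Both maps are assumed to contain no null values.
     */
    static String diagnose(String queueName,
                           Map<String, Long> queueMaxResources,
                           Map<String, Long> requested) {
        List<String> blocked = new ArrayList<>();
        for (Map.Entry<String, Long> entry : requested.entrySet()) {
            Long max = queueMaxResources.get(entry.getKey());
            // Assumption: a resource with no configured maximum is treated as
            // unlimited, so only an explicit 0 blocks the request.
            if (entry.getValue() > 0 && max != null && max == 0L) {
                blocked.add(entry.getKey());
            }
        }
        if (blocked.isEmpty()) {
            return "";
        }
        return "Cannot schedule request in queue '" + queueName
            + "': maxResources is 0 for requested resource(s): "
            + String.join(", ", blocked);
    }

    public static void main(String[] args) {
        // Mirrors the sample_queue maximums from the example configuration.
        Map<String, Long> max = new LinkedHashMap<>();
        max.put("memory-mb", 90000L);
        max.put("vcores", 4L);
        max.put("gpu", 0L);

        // A container request that asks for one GPU (memory and vcores are made-up values).
        Map<String, Long> request = new LinkedHashMap<>();
        request.put("memory-mb", 1024L);
        request.put("vcores", 1L);
        request.put("gpu", 1L);

        System.out.println(diagnose("sample_queue", max, request));
    }
}
```

A message of this kind, attached to the application's diagnostic info, would let a user see why the job stays pending instead of finding the field empty.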
