hadoop-yarn-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
Date Mon, 02 Jun 2014 16:33:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015518#comment-14015518 ]

Sandy Ryza commented on YARN-1913:
----------------------------------

This is looking good.  A few small things.

AppSchedulingInfo is only used to track pending resources.  We should hold amResource in SchedulerApplicationAttempt.

{code}
+          if (! queue.canRunAppAM(app.getAMResource())) {
{code}
Take out the space after the exclamation point.

{code}
   @Override
+  public boolean checkIfAMResourceUsageOverLimit(Resource usage, Resource maxAMResource) {
+    return Resources.greaterThan(RESOURCE_CALCULATOR, null, usage, maxAMResource);
+  }
{code}
Simpler to just use "usage.getMemory() > maxAMResource.getMemory()".

{code}
+      if (request.getPriority().equals(RMAppAttemptImpl.AM_CONTAINER_PRIORITY)) {
{code}
I'm a little nervous about using the priority here because apps could unwittingly submit all
requests at that priority.  Can we use SchedulerApplicationAttempt.getLiveContainers().isEmpty()?
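
A minimal sketch of the idea behind that suggestion, not code from the patch: an attempt with no live containers has not started its AM yet, so its next allocation must be the AM container, whatever priority the request carries. The class and method names below (AttemptSketch, nextContainerIsAM, containerAllocated) are hypothetical stand-ins, and container IDs are plain strings for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an application attempt; not a YARN class.
final class AttemptSketch {
    // Containers currently held by this attempt (IDs only, for illustration).
    private final List<String> liveContainers = new ArrayList<>();

    // Before anything has been allocated, the next container is the AM.
    // This does not depend on the priority the app put on its request.
    boolean nextContainerIsAM() {
        return liveContainers.isEmpty();
    }

    // Record a newly allocated container.
    void containerAllocated(String containerId) {
        liveContainers.add(containerId);
    }
}
```

With this check, an app that submits every request at the AM priority is still counted as requesting an AM only once, for its first container.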

> With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
> ------------------------------------------------------------------------------
>
>                 Key: YARN-1913
>                 URL: https://issues.apache.org/jira/browse/YARN-1913
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.3.0
>            Reporter: bc Wong
>            Assignee: Wei Yan
>         Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch
>
>
> It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs.
> One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a "maxApplicationMasterShare" config.
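
The proposed limit can be illustrated with a rough sketch. This is not the patch: it tracks memory only, in MB, and the names AmShareLimit, tryStartAM and amFinished are hypothetical. Only canRunAppAM and the "maxApplicationMasterShare" setting echo names from this thread, and the signature used here is an assumption.

```java
// Hypothetical sketch of an AM share limit; memory only, in MB.
final class AmShareLimit {
    private final long clusterMemoryMb;
    private final double maxAmShare;   // fraction of the cluster AMs may occupy, e.g. 0.5
    private long amMemoryInUseMb = 0;  // memory currently held by running AMs

    AmShareLimit(long clusterMemoryMb, double maxAmShare) {
        this.clusterMemoryMb = clusterMemoryMb;
        this.maxAmShare = maxAmShare;
    }

    // True if starting an AM of this size keeps total AM usage within the limit.
    boolean canRunAppAM(long amMemoryMb) {
        long limitMb = (long) (clusterMemoryMb * maxAmShare);
        return amMemoryInUseMb + amMemoryMb <= limitMb;
    }

    // Admit the AM and account for it, or refuse and leave usage unchanged.
    boolean tryStartAM(long amMemoryMb) {
        if (!canRunAppAM(amMemoryMb)) {
            return false;
        }
        amMemoryInUseMb += amMemoryMb;
        return true;
    }

    // Release the AM's share when its application finishes.
    void amFinished(long amMemoryMb) {
        amMemoryInUseMb -= amMemoryMb;
    }
}
```

For a 10240 MB cluster with a share of 0.5, two 2048 MB AMs are admitted and a third is refused, which leaves at least half the cluster for task containers so the running applications can make progress.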



--
This message was sent by Atlassian JIRA
(v6.2#6252)
