hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-5540) scheduler spends too much time looking at empty priorities
Date Tue, 23 Aug 2016 19:17:20 GMT

     [ https://issues.apache.org/jira/browse/YARN-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-5540:
    Attachment: YARN-5540.001.patch

The main problem is that a scheduler key is never removed from the collection of scheduler
keys, even when there are no further asks for that key.  There's also a separate issue where
we can fail to clean up the underlying hash map keys underneath a particular scheduler key,
but I believe that's more of a memory issue than a performance issue.  The performance issue
occurs because the inner loop of the schedulers iterates over the scheduler keys, so it's important
to remove keys we know are no longer necessary.

When I first started this patch I tried to clean up all of the bookkeeping, including
all the keys from the underlying requests hashmap.  This made for a much larger patch and
added new, interesting NPE possibilities, since requests could disappear in cases that are impossible
today.  For example, the current code goes out of its way to avoid removing the ANY request
for a scheduler key.  As such I decided to focus just on the scheduler key set size problem,
which makes for a smaller patch that should still fix the main problem behind this JIRA.

Attaching a patch for trunk for review.  The main idea is to reference count the various scheduler
keys and remove them once their refcount goes to zero.  We increment the refcount for a key
when the corresponding ANY request goes from zero to non-zero or if there's a container increment
request against that scheduler key when there wasn't one before.  Similarly we decrement the
refcount for a key when the corresponding ANY request goes from non-zero to zero or if there
are no container increment requests when there were some before.  When a scheduler key refcount
goes from 0 to 1 it is inserted in the collection of scheduler keys, and when it goes from
1 to 0 it is removed from the collection.  This also has the nice property that deactivation
checks simply become an isEmpty check on the collection of scheduler keys rather than a loop
over that collection.
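
To make the bookkeeping above concrete, here is a minimal, single-threaded sketch of the
reference counting idea.  It is not the code from the attached patch: the class name, the
method names, and the use of plain integers as scheduler keys are all assumptions made for
illustration, and it assumes callers never pass negative counts.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration only; none of these names come from the YARN source. */
class SchedulerKeyRefCounts {
    // Active scheduler keys (kept in sorted order) mapped to their refcount.
    // A key is present here only while its refcount is at least 1.
    private final TreeMap<Integer, Integer> refCounts = new TreeMap<>();
    // Last known number of outstanding ANY containers per key (assumed bookkeeping).
    private final Map<Integer, Integer> anyAsks = new HashMap<>();
    // Last known number of container increase requests per key (assumed bookkeeping).
    private final Map<Integer, Integer> increaseAsks = new HashMap<>();

    /** Record a new ANY ask count for a key and adjust the key's refcount. */
    void updateAnyAsk(int key, int newCount) {
        int oldCount = anyAsks.getOrDefault(key, 0);
        anyAsks.put(key, newCount);
        adjust(key, oldCount, newCount);
    }

    /** Record a new increase-request count for a key and adjust the key's refcount. */
    void updateIncreaseAsks(int key, int newCount) {
        int oldCount = increaseAsks.getOrDefault(key, 0);
        increaseAsks.put(key, newCount);
        adjust(key, oldCount, newCount);
    }

    // Only transitions between zero and non-zero change the refcount.
    private void adjust(int key, int oldCount, int newCount) {
        if (oldCount == 0 && newCount > 0) {
            // 0 -> 1 inserts the key into the collection; otherwise bump the count.
            refCounts.merge(key, 1, Integer::sum);
        } else if (oldCount > 0 && newCount == 0) {
            // Returning null removes the mapping, so 1 -> 0 drops the key entirely.
            refCounts.computeIfPresent(key, (k, v) -> v > 1 ? v - 1 : null);
        }
    }

    /** Keys the scheduler loop would still need to visit. */
    Set<Integer> activeKeys() {
        return refCounts.keySet();
    }

    /** The deactivation check reduces to an emptiness test. */
    boolean hasActiveKeys() {
        return !refCounts.isEmpty();
    }
}
```

In this sketch a key with both an outstanding ANY ask and an increase request has a refcount
of 2, so it stays in the collection until both sources have dropped back to zero.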

Once we're agreed on a version for trunk I'll put up the separate patches for branch-2.8 and
branch-2.7 due to changes from YARN-5392 and YARN-1651, respectively.

> scheduler spends too much time looking at empty priorities
> ----------------------------------------------------------
>                 Key: YARN-5540
>                 URL: https://issues.apache.org/jira/browse/YARN-5540
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler, fairscheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Nathan Roberts
>            Assignee: Jason Lowe
>         Attachments: YARN-5540.001.patch
> We're starting to see the capacity scheduler run out of scheduling horsepower when running 500-1000 applications on clusters with 4K nodes or so.
> This seems to be amplified by TEZ applications. TEZ applications have many more priorities (sometimes in the hundreds) than typical MR applications and therefore the loop in the scheduler which examines every priority within every running application, starts to be a hotspot. The priorities appear to stay around forever, even when there is no remaining resource request at that priority causing us to spend a lot of time looking at nothing.
> jstack snippet:
> {noformat}
> "ResourceManager Event Processor" #28 prio=5 os_prio=0 tid=0x00007fc2d453e800 nid=0x22f3 runnable [0x00007fc2a8be2000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequest(SchedulerApplicationAttempt.java:210)
>         - eliminated <0x00000005e73e5dc0> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:852)
>         - locked <0x00000005e73e5dc0> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
>         - locked <0x00000003006fcf60> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:527)
>         - locked <0x00000003001b22f8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:415)
>         - locked <0x00000003001b22f8> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1224)
>         - locked <0x0000000300041e40> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler)
> {noformat}

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
