hadoop-yarn-issues mailing list archives

From "Srikanth Kandula (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4056) Bundling: Searching for multiple containers in a single pass over {queues, applications, priorities}
Date Thu, 27 Aug 2015 18:44:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717267#comment-14717267 ]

Srikanth Kandula commented on YARN-4056:

I looked. The two are somewhat similar, but not really the same. The similarity is that both
allow multiple containers to be allocated in fewer calls.

The difference is in the policies and the complexity. Bundling allows any arbitrary subset
of 'legit' tasks to be assigned, whereas assignMultiple simply assigns the first few. For
example, bundling can decide that the 2nd, 3rd and 10th tasks are a good choice, in contrast
to assigning just the 1st task (the others may not fit). assignMultiple does not allow for
this.

Bundling is slightly more complex because the actual assignment is deferred until the loop
finishes, whereas assignMultiple assigns each task in place and keeps going.
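
To make the contrast concrete, here is a minimal sketch. It is not the code in the patch. The
names Task, assignInPlace and bundle, the flat task list, the single-number resource sizes and
the smallest-first policy are all assumptions made for illustration; the real scheduler walks
queues, applications and priorities, and a real bundler could use any policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class BundlingSketch {
    // Hypothetical stand-in for a pending container request (size in arbitrary resource units).
    static final class Task {
        final int id;
        final int size;
        Task(int id, int size) { this.id = id; this.size = size; }
    }

    // Simplified model of in-place assignment: each task that still fits is
    // assigned immediately, in iteration order, and the loop keeps going.
    static List<Integer> assignInPlace(List<Task> tasks, int capacity) {
        List<Integer> assigned = new ArrayList<>();
        int free = capacity;
        for (Task t : tasks) {
            if (t.size <= free) {
                free -= t.size;
                assigned.add(t.id);
            }
        }
        return assigned;
    }

    // Simplified model of bundling: the pass only collects candidates; the
    // decision on which subset to assign is deferred until the loop finishes.
    static List<Integer> bundle(List<Task> tasks, int capacity) {
        List<Task> candidates = new ArrayList<>();
        for (Task t : tasks) {
            if (t.size <= capacity) {
                candidates.add(t); // fits on its own, so it is a legitimate candidate
            }
        }
        // Deferred decision. Assumed policy: pack the smallest tasks first.
        candidates.sort(Comparator.comparingInt((Task t) -> t.size));
        List<Integer> chosen = new ArrayList<>();
        int free = capacity;
        for (Task t : candidates) {
            if (t.size <= free) {
                free -= t.size;
                chosen.add(t.id);
            }
        }
        chosen.sort(Comparator.naturalOrder());
        return chosen;
    }

    // Ten hypothetical tasks; the 1st is large, the 2nd, 3rd and 10th are small.
    static List<Task> exampleTasks() {
        int[] sizes = {8, 3, 3, 5, 5, 5, 5, 5, 5, 4};
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < sizes.length; i++) {
            tasks.add(new Task(i + 1, sizes[i]));
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Task> tasks = exampleTasks();
        System.out.println(assignInPlace(tasks, 10)); // prints [1]
        System.out.println(bundle(tasks, 10));        // prints [2, 3, 10]
    }
}
```

With a node capacity of 10, the in-place loop takes the 1st task and then nothing else fits,
while the deferred selection ends up with the 2nd, 3rd and 10th tasks.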

Patch is with [~chris.douglas] for an internal review.

We are pushing out a bundler that mimics the current scheduler. As expected, all the tests pass
and there is no performance change. Note, however, that the allocations are still deferred.

Better bundlers are in the works.

> Bundling: Searching for multiple containers in a single pass over {queues, applications, priorities}
> ----------------------------------------------------------------------------------------------------
>                 Key: YARN-4056
>                 URL: https://issues.apache.org/jira/browse/YARN-4056
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacityscheduler, resourcemanager, scheduler
>            Reporter: Srikanth Kandula
>            Assignee: Robert Grandl
>         Attachments: bundling.docx
> More than one container is allocated on many NM heartbeats. Yet, the current scheduler
> allocates exactly one container per iteration over {{queues, applications, priorities}}. When
> there are many queues, applications, or priorities, allocating only one container per iteration
> can needlessly increase the duration of the NM heartbeat.
> In this JIRA, we propose bundling. That is, allow arbitrarily many containers to be allocated
> in a single iteration over {{queues, applications and priorities}}.

