Date: Sun, 09 Nov 2014 11:50:12 +0100
From: Fabio
To: yarn-dev@hadoop.apache.org
Subject: CS code for container allocation policy + a possible inefficiency

Hi guys,

while exploring the capacity scheduler source code I am having some trouble understanding the policy behind container allocation. As far as I can see in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue, for every app (with pending resource requests) and for every priority (under which there actually are requests), the scheduler allocates a *single* container and then moves on in the iteration.
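Just to check that I am reading the code correctly, this is roughly the control flow I see in LeafQueue (heavily simplified pseudocode written by me; the names and signatures are illustrative, not the real Hadoop ones):

    // Simplified sketch of what I understand happens on each scheduling pass;
    // names and signatures here are illustrative, not the actual Hadoop code.
    for (FiCaSchedulerApp app : activeApplications) {      // ordered by application ID
        for (Priority priority : app.getPriorities()) {    // highest priority first
            ResourceRequest request = app.getResourceRequest(priority, node);
            if (request == null || request.getNumContainers() == 0) {
                continue;
            }
            // Only one container is assigned here before the loops move on to
            // the next priority / next application, even if
            // request.getNumContainers() > 1.
            assignContainer(node, app, priority, request);
        }
    }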
This seems very strange to me: it means we don't try to satisfy all of the highest-priority requests for a given app before moving on. Nonetheless, that appears to be what happens, since I don't see any kind of iteration that tries to allocate all the containers required by a ResourceRequest at a given priority. For example, if an app has a request for n > 1 containers at the highest priority (as a single instance of org.apache.hadoop.yarn.api.records.ResourceRequest), the chain of calls that allocates a container is invoked just once, so the next container is allocated for a lower-priority request. Did I miss a "WHILE there are enough available resources AND ResourceRequest.getNumContainers() > 0, DO allocate a container" loop, or something else? Could anyone point me to the code that implements this?

Furthermore, isn't it quite inefficient to allocate resources in this way? Basically, we allocate resources to applications in order of application submission (they are ordered according to their ID). But then, if I submit an application while hundreds of other apps are running in the same leaf queue, my requests will never be served as long as any other app is still asking for even a few resources, even though those requests may have been submitted after mine. I thought that "FIFO order within a leaf queue" meant that requests (and not applications) are served in FIFO order. Isn't that last interpretation more reasonable? Is there any reason behind the actual implementation?

Regards
Fabio
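P.S. To make the question about the missing loop concrete, this is the kind of inner loop I expected to find somewhere along that call path (again purely illustrative pseudocode; helpers like fitsOnNode() and assignContainer() are made up):

    // Keep allocating containers for the SAME request, at the SAME priority,
    // until the request is exhausted or the node runs out of resources.
    while (request.getNumContainers() > 0 && fitsOnNode(request, node)) {
        assignContainer(node, app, priority, request); // would also decrement getNumContainers()
    }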