hadoop-common-user mailing list archives

From Rafał Radecki <radecki.ra...@gmail.com>
Subject Yarn 2.7.3 - capacity scheduler container allocation to nodes?
Date Wed, 09 Nov 2016 13:37:43 GMT
Hi All.

I have a 4-node cluster on which I run YARN. I created 2 queues, "long" and
"short", the first with 70% resource allocation and the second with 30%.
Both queues are configured on all available nodes by default.
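The relevant part of my capacity-scheduler.xml looks roughly like this (a sketch of my setup, trimmed to the queue definitions):

```xml
<!-- Two queues under root: "long" gets 70%, "short" gets 30% -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>long,short</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.long.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.short.capacity</name>
  <value>30</value>
</property>
```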

Each node has ~50GB of memory available to YARN. Initially I assumed that
when I run tasks in the "short" queue, YARN would spread them across all
nodes, using up to 30% of the memory on each node. So, for example, if I run
20 tasks of 2GB each (40GB in total) in the "short" queue:
- the first ~7 tasks would be scheduled on node1 (14GB total, under the
15GB available for "short" on this node: 30% of 50GB)
- the next ~7 tasks would be scheduled on node2
- the remaining ~6 tasks would be scheduled on node3
- YARN on node4 would not use any resources assigned to the "short" queue.
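The per-node arithmetic I had in mind can be checked quickly (the numbers below are my cluster's, as described above):

```python
# Expected per-node capacity for the "short" queue (30% of 50GB per node)
node_mem_gb = 50
short_share = 0.30
per_node_cap_gb = node_mem_gb * short_share  # 15 GB per node

# How many 2GB tasks fit under that per-node cap
task_mem_gb = 2
tasks_per_node = int(per_node_cap_gb // task_mem_gb)  # 7 tasks

print(per_node_cap_gb, tasks_per_node)
```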
But this does not seem to be the case. At the moment I see that all tasks
are started on node1 and the other nodes have no tasks running.

I attached my yarn-site.xml and capacity-scheduler.xml.

Is there a way to force YARN to apply the thresholds configured above (70%
and 30%) per node rather than for the cluster as a whole? I would like a
configuration in which, on every node, 30% is always reserved for the
"short" queue and 70% for the "long" queue, and in case any resources are
free for a particular queue, they are not used by other queues. Is that
possible?
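The closest thing I have found so far is capping each queue at its configured share, which I believe disables queue elasticity; a sketch of what I mean (untested on my cluster, and as far as I can tell this caps the queue cluster-wide, not per node):

```xml
<!-- Set maximum-capacity equal to capacity so neither queue can
     borrow the other's free resources (cluster-wide cap, not per node) -->
<property>
  <name>yarn.scheduler.capacity.root.long.maximum-capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.short.maximum-capacity</name>
  <value>30</value>
</property>
```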

