hadoop-common-user mailing list archives

From Rafał Radecki <radecki.ra...@gmail.com>
Subject Re: Yarn 2.7.3 - capacity scheduler container allocation to nodes?
Date Thu, 10 Nov 2016 06:58:51 GMT
Hi Ravi.

I did not specify labels this time ;) I just created two queues, as is
visible in the configuration.
Overall the queues work, but jobs are allocated differently than I expected,
as I wrote at the beginning.


2016-11-10 2:48 GMT+01:00 Ravi Prakash <ravihadoop@gmail.com>:

> Hi Rafal!
> Have you been able to launch the job successfully first without
> configuring node-labels? Do you really need node-labels? How much total
> memory do you have on the cluster? Node labels are usually for specifying
> special capabilities of the nodes (e.g. some nodes could have GPUs and your
> application could request to be run on only the nodes which have GPUs)
> Ravi
> On Wed, Nov 9, 2016 at 5:37 AM, Rafał Radecki <radecki.rafal@gmail.com>
> wrote:
>> Hi All.
>> I have a 4-node cluster on which I run YARN. I created 2 queues, "long"
>> and "short", the first with a 70% resource allocation, the second with 30%.
>> Both queues are configured on all available nodes by default.
>> My memory for YARN per node is ~50GB. Initially I thought that when I
>> run tasks in the "short" queue, YARN would allocate them across nodes using
>> at most 30% of the memory on each node. So for example, if I run 20 tasks,
>> 2GB each (40GB in total), in the "short" queue:
>> - the first ~7 will be scheduled on node1 (14GB total; 30% of the 50GB
>> available on this node gives 15GB for the "short" queue)
>> - the next ~7 tasks will be scheduled on node2
>> - the remaining ~6 tasks will be scheduled on node3
>> - YARN on node4 will not use any resources assigned to the "short" queue.
>> But this seems not to be the case. At the moment I see that all tasks are
>> started on node1 and the other nodes have no tasks started.
>> I attached my yarn-site.xml and capacity-scheduler.xml.
>> Is there a way to force YARN to apply the thresholds configured above (70%
>> and 30%) per node and not to the cluster as a whole? I would like to get a
>> configuration in which on every node 30% is always available for the "short"
>> queue and 70% for the "long" queue, and in which resources left free by one
>> queue are not used by other queues. Is it possible?
>> BR,
>> Rafal.
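
As a rough sketch of the arithmetic in the question (not part of the original message), the Python below contrasts the per-node placement the poster expected with a cluster-wide reading of the 30% share. All function and constant names are hypothetical. The cluster-wide reading assumes that capacity scheduler queue capacities are percentages of total cluster resources rather than of each node; the numbers (4 nodes, ~50GB each, 20 tasks of 2GB) come from the message.

```python
# Figures taken from the message; names are illustrative only.
NODES = 4
NODE_MEM_GB = 50       # approximate YARN memory per node
SHORT_PERCENT = 30     # configured capacity of the "short" queue
TASKS = 20
TASK_MEM_GB = 2


def per_node_expectation():
    """Placement the poster expected: each node caps "short" at 30% of its own memory."""
    per_node_cap_gb = NODE_MEM_GB * SHORT_PERCENT // 100  # 15 GB per node
    tasks_per_node = per_node_cap_gb // TASK_MEM_GB       # 7 tasks of 2 GB
    placement = []
    remaining = TASKS
    for _ in range(NODES):
        placed = min(tasks_per_node, remaining)
        placement.append(placed)
        remaining -= placed
    return placement


def cluster_wide_short_capacity_gb():
    """Assumption: the 30% share applies to the memory of the whole cluster."""
    return NODES * NODE_MEM_GB * SHORT_PERCENT // 100


def total_demand_gb():
    """Memory requested by all tasks together."""
    return TASKS * TASK_MEM_GB


def tasks_fitting_on_one_node():
    """How many 2 GB tasks fit on a single node by memory alone."""
    return NODE_MEM_GB // TASK_MEM_GB


if __name__ == "__main__":
    print("expected per-node placement:", per_node_expectation())
    print("cluster-wide 'short' capacity (GB):", cluster_wide_short_capacity_gb())
    print("total demand (GB):", total_demand_gb())
    print("tasks that fit on one node:", tasks_fitting_on_one_node())
```

Under that assumption, the 40GB requested stays below the 60GB cluster-wide share of the "short" queue, and a single ~50GB node has room for all 20 tasks, so seeing every task on node1 would not by itself contradict the configured percentages.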
