hadoop-hdfs-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: Containers and CPU
Date Tue, 02 Jul 2013 19:18:51 GMT
To explain my reasoning, suppose that I have an application that performs some CPU-intensive
calculation, and can scale to multiple cores internally, but it doesn't need those cores all
the time because the CPU-intensive phase is only a part of the overall computation.  I'm not
sure I understand cgroups' CPU control: does it statically mask which cores are available to a process, or does it set up prioritized access to all available cores?

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, July 02, 2013 1:12 PM
To: user@hadoop.apache.org
Subject: RE: Containers and CPU

Sorry, I don't completely follow.
When you say "with cgroups on", is that an attribute of the AM, the Scheduler, or the Site/RM? In other words, is it site-wide or something that my application can control?
With cgroups on, is there still a way to get my desired behavior?  I'd really like all tasks
to have access to all CPU cores and simply fight it out in the OS thread scheduler.
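From what I've read so far, cgroups enforcement looks like a NodeManager (site-level) setting rather than something an application toggles. A sketch of the yarn-site.xml entries as I understand them from the Hadoop 2.x docs (property names worth verifying against your version):

```xml
<!-- yarn-site.xml (NodeManager side): enabling cgroups enforcement is a
     cluster-wide operator decision, not a per-application one.
     Property names per Hadoop 2.x documentation; verify for your version. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
```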

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Tuesday, July 02, 2013 11:56 AM
To: user@hadoop.apache.org
Subject: Re: Containers and CPU

CPU limits are only enforced if cgroups is turned on.  With cgroups on, tasks are only throttled when there is contention, in which case they are given CPU time in proportion to the number of cores requested for/allocated to them.  Does that make sense?
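A small sketch of the proportional behavior described above. The function name and numbers are illustrative (not YARN API); the 1024-shares-per-core weighting follows the common cgroups cpu.shares convention:

```python
# Illustrative sketch: how cgroups-style proportional CPU sharing behaves
# under contention. Each container's weight is proportional to its requested
# vcores; when the node is saturated, each container gets CPU time equal to
# its weight divided by the total weight.

def cpu_fraction_under_contention(requested_vcores):
    """Return each container's fraction of CPU when all containers contend."""
    shares = [v * 1024 for v in requested_vcores]  # cgroups convention: 1024 shares per core
    total = sum(shares)
    return [s / total for s in shares]

# Three containers requesting 1, 2, and 1 vcores on a fully busy node:
print(cpu_fraction_under_contention([1, 2, 1]))  # [0.25, 0.5, 0.25]
```

When there is no contention, none of this kicks in: any container may use idle CPU freely, which is why limits only bite when the node is saturated.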


On Tue, Jul 2, 2013 at 9:50 AM, Chuan Liu <chuanliu@microsoft.com> wrote:
I believe this is the default behavior.
By default, only memory limit on resources is enforced.
The capacity scheduler will use DefaultResourceCalculator to compute resource allocation for
containers by default, which also does not take CPU into account.
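For reference, a hedged sketch of the relevant capacity-scheduler.xml setting (property and class names per the Hadoop 2.x documentation as I understand it; verify against your version). The default value shown here is the memory-only calculator; swapping in DominantResourceCalculator is what would make the scheduler account for CPU as well:

```xml
<!-- capacity-scheduler.xml: DefaultResourceCalculator considers memory only,
     so CPU requests do not affect container allocation under the default. -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>
```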


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Tuesday, July 02, 2013 8:57 AM
To: user@hadoop.apache.org
Subject: Containers and CPU

I have YARN tasks that benefit from multicore scaling.  However, they don't *always* use more
than one core.  I would like to allocate containers based only on memory, and let each task
use as many cores as needed, without allocating exclusive CPU "slots" in the scheduler.  For
example, on an 8-core node with 16GB memory, I'd like to be able to run 3 tasks each consuming
4GB memory and each using as much CPU as they like.  Is this the default behavior if I don't
specify CPU restrictions to the scheduler?
