Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-issues@hadoop.apache.org
Date: Fri, 13 Jul 2012 21:00:47 +0000 (UTC)
From: "Arun C Murthy (JIRA)" <jira@apache.org>
To: mapreduce-issues@hadoop.apache.org
Message-ID: <1162408072.50358.1342213247486.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <1513010632.3133.1339433923183.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (MAPREDUCE-4334) Add support for CPU
 isolation/monitoring of containers
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414053#comment-13414053 ] 

Arun C Murthy commented on MAPREDUCE-4334:
------------------------------------------

Andrew - please don't take this the wrong way, I certainly am *not* trying to debate taskset vs. cgroups. All I'm saying is that we need both for the dominant platforms: RHEL5 and RHEL6. I perfectly understand that you might not have the time or the inclination to do both, and I'm happy to help personally, but supporting just RHEL6 isn't enough.

Given that, we have two options:
# Admin-setup cgroups (outside YARN) 
# YARN handles it on its own via LCE (LinuxContainerExecutor)

Now the pros of using LCE:
# It already exists! Hence it doesn't impose any *new* operational requirements. 
# It's consistent for both technologies/platforms we need to support: taskset/RHEL5 and cgroups/RHEL6. 
# Even better, we can use the same approach for any future platform, e.g. a WindowsContainerExecutor (we already have WindowsTaskController in branch-1-win, which will need to be ported to branch-2 soon).
# It's *much less* overhead for admins - they don't have to create cgroups upfront, and they don't have to set up mounts so that the cgroups survive reboots, etc.

Cons:
# It requires LCE even for non-secure setups. We actually did support LTC without security in branch-1 at some point; happy to discuss.

In the alternative (admin-setup cgroups), we will _still_ need LCE (or worse, *another* setuid script) to support taskset. To me that is a very bad choice.

As a result, using LCE seems like a significantly superior alternative.
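
To illustrate what "one component, two mechanisms" could look like, here is a minimal sketch in Python (not Hadoop code, and not the LCE's actual interface). It only builds a plan of actions for a privileged helper and executes nothing. The names {{plan_cpu_isolation}}, {{RunCommand}}, {{MakeDir}} and {{WriteFile}} are hypothetical, and the cgroup mount point {{/cgroup/cpu/hadoop-yarn}} is an assumption; {{taskset -cp}} and the cgroup v1 {{cpu.shares}} and {{tasks}} files are the standard Linux interfaces.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RunCommand:
    """A command line the privileged helper would execute."""
    argv: List[str]


@dataclass
class MakeDir:
    """A directory the privileged helper would create."""
    path: str


@dataclass
class WriteFile:
    """A file the privileged helper would write."""
    path: str
    content: str


Action = Union[RunCommand, MakeDir, WriteFile]


def plan_cpu_isolation(
    mechanism: str,
    pid: int,
    container_id: str,
    cpu_list: str = "0",
    cpu_shares: int = 1024,
    # Assumed mount point; real deployments may mount the cpu controller elsewhere.
    cgroup_root: str = "/cgroup/cpu/hadoop-yarn",
) -> List[Action]:
    """Return the actions needed to constrain one container process.

    mechanism is "taskset" (e.g. RHEL5) or "cgroups" (e.g. RHEL6).
    """
    if mechanism == "taskset":
        # Pin the existing process to the given CPU list.
        return [RunCommand(["taskset", "-cp", cpu_list, str(pid)])]
    if mechanism == "cgroups":
        group = f"{cgroup_root}/{container_id}"
        return [
            MakeDir(group),                                    # one cgroup per container
            WriteFile(f"{group}/cpu.shares", str(cpu_shares)),  # relative CPU weight
            WriteFile(f"{group}/tasks", str(pid)),              # move the process in
        ]
    raise ValueError(f"unknown mechanism: {mechanism!r}")


if __name__ == "__main__":
    for mech in ("taskset", "cgroups"):
        print(mech, plan_cpu_isolation(mech, 1234, "container_01", cpu_list="0,1"))
```

The caller has the same entry point on both platforms; only the plan handed to the setuid helper differs.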

----

Some other comments:

bq. In my mind, the LCE is for starting processes, and should stick to doing that. 

Not true at all; we already use it for container cleanup etc. 

{quote}
4) For cgroups, we could have a second ContainersMonitor plugin which uses a setuid root binary to also mount & create cgroups, freeing the admin from managing them at all.
5) For taskset, we can implement a ContainersMonitor which uses a setuid root binary (potentially the LCE, but perhaps better if it's something else, just to keep the security footprint down) to pin processes to CPUs. This ContainersMonitor will also need the memory enforcement code from the current ContainersMonitorImpl
{quote}

Like I said above, rather than having two ways to do the same thing, using one *existing* component, i.e. LCE, seems like the clear choice.

I understand you might not have time to port your work to LCE; I'm happy to either help or take up that work.
                
> Add support for CPU isolation/monitoring of containers
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-4334
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Arun C Murthy
>            Assignee: Andrew Ferguson
>         Attachments: MAPREDUCE-4334-pre1.patch, MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-pre3.patch
>
>
> Once we get in MAPREDUCE-4327, it will be important to actually enforce limits on CPU consumption of containers. 
> Several options spring to mind:
> # taskset (RHEL5+)
> # cgroups (RHEL6+)
