Date: Sat, 11 Jul 2015 00:42:04 +0000 (UTC)
From: "Jie Yu (JIRA)"
To: issues@mesos.apache.org
Subject: [jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

[ https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623136#comment-14623136 ]

Jie Yu commented on MESOS-2652:
-------------------------------

{quote}E.g., high share ratio, revocable is idle, non-revocable consumes a ton of cpu time (more than, say, the 1000:1 ratio), then goes idle, revocable then has something to do and starts running ==> now what happens if the non-revocable wants to run? Won't the revocable task continue to run until the share ratio is equalized?{quote}

As far as I know, no such preemption mechanism exists in the kernel that we can use. Real-time priority allows preemption, but real-time priority is not compatible with cgroups (http://www.novell.com/support/kb/doc.php?id=7012851).

{quote}I don't know the answer without reading the scheduler source code but given that my assumption about SCHED_IDLE turned out to be incomplete/incorrect then let's understand the preemption behavior before committing another incorrect mechanism{quote}

Yeah, I am using the benchmark I mentioned above to see whether the new hierarchy works as expected. I'll probably add a latency benchmark (e.g., http://parsa.epfl.ch/cloudsuite/memcached.html) to see whether latency is affected.

But given that we don't have a way for the kernel to preempt revocable tasks, setting shares seems to be the only solution.
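To make the shares discussion concrete, here is a minimal Python sketch (not Mesos code) of how CFS treats cpu.shares as relative weights among *runnable* sibling cgroups. The subtree names and the 1024:2 values are assumptions drawn from this thread (the attachments below describe reducing cpu.share to 2 for revocable tasks; 1024 is the kernel default).

{code}
# A minimal sketch of how CFS cpu.shares divide CPU among *runnable*
# sibling cgroups: shares are relative weights, so an idle sibling
# consumes nothing, and there is no preemption -- a runnable low-share
# group still gets its (tiny) proportional slice.

def cpu_fractions(shares, runnable):
    """shares: dict of cgroup name -> cpu.shares value.
    runnable: set of cgroup names with runnable tasks.
    Returns dict of name -> fraction of total CPU received."""
    total = sum(shares[name] for name in runnable)
    return {name: (shares[name] / total if name in runnable else 0.0)
            for name in shares}

# Hypothetical two-subtree split: 1024 shares for the normal subtree,
# 2 for the revocable subtree (the biased ratio discussed above).
shares = {"mesos": 1024, "mesos_revocable": 2}

# Both subtrees busy: revocable gets only ~0.2% of the CPU.
print(cpu_fractions(shares, runnable={"mesos", "mesos_revocable"}))

# Normal subtree idle: revocable can use the whole machine.
print(cpu_fractions(shares, runnable={"mesos_revocable"}))
{code}

Note that shares only bias future scheduling decisions; they never force a running revocable task off the CPU, which is exactly the preemption gap described above.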
> Update Mesos containerizer to understand revocable cpu resources
> ----------------------------------------------------------------
>
>          Key: MESOS-2652
>          URL: https://issues.apache.org/jira/browse/MESOS-2652
>      Project: Mesos
>   Issue Type: Task
>     Reporter: Vinod Kone
>     Assignee: Ian Downes
>       Labels: twitter
>      Fix For: 0.23.0
>
>  Attachments: Abnormal performance with 3 additional revocable tasks (1).png,
>               Abnormal performance with 3 additional revocable tasks (2).png,
>               Abnormal performance with 3 additional revocable tasks (3).png,
>               Abnormal performance with 3 additional revocable tasks (4).png,
>               Abnormal performance with 3 additional revocable tasks (5).png,
>               Abnormal performance with 3 additional revocable tasks (6).png,
>               Abnormal performance with 3 additional revocable tasks (7).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (1).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (2).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (3).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (4).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (5).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (6).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (7).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (8).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (9).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (10).png
>
>
> The CPU isolator needs to properly set limits for revocable and non-revocable containers.
> The proposed strategy is to use a two-way split of the cpu cgroup hierarchy -- normal (non-revocable) and low priority (revocable) subtrees -- and to use a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split (TBD). Containers would be present in only one of the subtrees. CFS quotas will *not* be set on subtree roots, only cpu.shares. Each container would set CFS quota and shares as done currently.
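For illustration, a minimal Python sketch of the cgroupfs writes that proposal implies, under stated assumptions: a cgroup v1 cpu controller mounted at /sys/fs/cgroup/cpu, hypothetical subtree names "mesos" and "mesos_revocable", and the example 20:1 shares split (the actual ratio is marked TBD above). This is not the Mesos containerizer implementation.

{code}
# Sketch of the proposed two-way split: biased cpu.shares on the
# subtree roots (no CFS quota there), and per-container shares plus
# CFS quota set as done currently.
import os

CPU_ROOT = "/sys/fs/cgroup/cpu"  # assumed cgroup v1 mount point

def write(cgroup, control, value):
    with open(os.path.join(CPU_ROOT, cgroup, control), "w") as f:
        f.write(str(value))

def setup_subtrees():
    # Normal (non-revocable) vs. low-priority (revocable) subtrees,
    # weighted 20:1 via cpu.shares; *no* cpu.cfs_quota_us on the roots.
    for name, shares in [("mesos", 1024 * 20), ("mesos_revocable", 1024)]:
        os.makedirs(os.path.join(CPU_ROOT, name), exist_ok=True)
        write(name, "cpu.shares", shares)

def add_container(subtree, container_id, cpus):
    # Each container lives in exactly one subtree and gets shares
    # proportional to its cpus plus a CFS quota as a hard cap.
    cgroup = os.path.join(subtree, container_id)
    os.makedirs(os.path.join(CPU_ROOT, cgroup), exist_ok=True)
    write(cgroup, "cpu.shares", int(1024 * cpus))
    period = 100000                       # default CFS period: 100ms
    write(cgroup, "cpu.cfs_period_us", period)
    write(cgroup, "cpu.cfs_quota_us", int(cpus * period))
{code}

For example, add_container("mesos_revocable", "container-abc", cpus=0.5) would cap a revocable container at half a CPU while its whole subtree stays out-weighted 20:1 by the normal subtree under contention.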