Date: Sat, 11 Jul 2015 00:42:04 +0000 (UTC)
From: "Jie Yu (JIRA)"
To: issues@mesos.apache.org
Subject: [jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

[ https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623136#comment-14623136 ]

Jie Yu commented on MESOS-2652:
-------------------------------

{quote}E.g., high share ratio, revocable is idle, non-revocable consumes a ton of cpu time (more than, say, the 1000:1 ratio), then goes idle, revocable then has something to do and starts running ==> now what happens if the non-revocable wants to run? Won't the revocable task continue to run until the share ratio is equalized?{quote}

As far as I know, no such preemption mechanism exists in the kernel that we can use. Real-time priority allows preemption, but real-time priority is not compatible with cgroups (http://www.novell.com/support/kb/doc.php?id=7012851).

{quote}I don't know the answer without reading the scheduler source code but given that my assumption about SCHED_IDLE turned out to be incomplete/incorrect then let's understand the preemption behavior before committing another incorrect mechanism{quote}

Yeah, I am using the benchmark I mentioned above to see whether the new hierarchy works as expected. I'll probably add a latency benchmark (e.g., http://parsa.epfl.ch/cloudsuite/memcached.html) to see whether latency is affected.

But given that we don't have a way for the kernel to preempt revocable tasks, setting shares seems to be the only solution.
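To make the shares discussion concrete, here is a minimal Python sketch (not Mesos code) of how CFS treats cpu.shares as relative weights among *runnable* sibling cgroups. The subtree names and the 1024:2 values are assumptions drawn from this thread (the attachments below describe reducing cpu.share to 2 for revocable tasks; 1024 is the kernel default).

{code}
# A minimal sketch of how CFS cpu.shares divide CPU among *runnable*
# sibling cgroups: shares are relative weights, so an idle sibling
# consumes nothing, and there is no preemption -- a runnable low-share
# group still gets its (tiny) proportional slice.

def cpu_fractions(shares, runnable):
    """shares: dict of cgroup name -> cpu.shares value.
    runnable: set of cgroup names with runnable tasks.
    Returns dict of name -> fraction of total CPU received."""
    total = sum(shares[name] for name in runnable)
    return {name: (shares[name] / total if name in runnable else 0.0)
            for name in shares}

# Hypothetical two-subtree split: 1024 shares for the normal subtree,
# 2 for the revocable subtree (the biased ratio discussed above).
shares = {"mesos": 1024, "mesos_revocable": 2}

# Both subtrees busy: revocable gets only ~0.2% of the CPU.
print(cpu_fractions(shares, runnable={"mesos", "mesos_revocable"}))

# Normal subtree idle: revocable can use the whole machine.
print(cpu_fractions(shares, runnable={"mesos_revocable"}))
{code}

Note that shares only bias future scheduling decisions; they never force a running revocable task off the CPU, which is exactly the preemption gap described above.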
> Update Mesos containerizer to understand revocable cpu resources
> ----------------------------------------------------------------
>
>          Key: MESOS-2652
>          URL: https://issues.apache.org/jira/browse/MESOS-2652
>      Project: Mesos
>   Issue Type: Task
>     Reporter: Vinod Kone
>     Assignee: Ian Downes
>       Labels: twitter
>      Fix For: 0.23.0
>
>  Attachments: Abnormal performance with 3 additional revocable tasks (1).png,
>               Abnormal performance with 3 additional revocable tasks (2).png,
>               Abnormal performance with 3 additional revocable tasks (3).png,
>               Abnormal performance with 3 additional revocable tasks (4).png,
>               Abnormal performance with 3 additional revocable tasks (5).png,
>               Abnormal performance with 3 additional revocable tasks (6).png,
>               Abnormal performance with 3 additional revocable tasks (7).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (1).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (2).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (3).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (4).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (5).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (6).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (7).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (8).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (9).png,
>               Performance improvement after reducing cpu.share to 2 for revocable tasks (10).png
>
>
> The CPU isolator needs to properly set limits for revocable and non-revocable containers.
> The proposed strategy is to use a two-way split of the cpu cgroup hierarchy -- normal (non-revocable) and low priority (revocable) subtrees -- and to use a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split (TBD). Containers would be present in only one of the subtrees. CFS quotas will *not* be set on subtree roots, only cpu.shares. Each container would set CFS quota and shares as done currently.
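For illustration, a minimal Python sketch of the cgroupfs writes that proposal implies, under stated assumptions: a cgroup v1 cpu controller mounted at /sys/fs/cgroup/cpu, hypothetical subtree names "mesos" and "mesos_revocable", and the example 20:1 shares split (the actual ratio is marked TBD above). This is not the Mesos containerizer implementation.

{code}
# Sketch of the proposed two-way split: biased cpu.shares on the
# subtree roots (no CFS quota there), and per-container shares plus
# CFS quota set as done currently.
import os

CPU_ROOT = "/sys/fs/cgroup/cpu"  # assumed cgroup v1 mount point

def write(cgroup, control, value):
    with open(os.path.join(CPU_ROOT, cgroup, control), "w") as f:
        f.write(str(value))

def setup_subtrees():
    # Normal (non-revocable) vs. low-priority (revocable) subtrees,
    # weighted 20:1 via cpu.shares; *no* cpu.cfs_quota_us on the roots.
    for name, shares in [("mesos", 1024 * 20), ("mesos_revocable", 1024)]:
        os.makedirs(os.path.join(CPU_ROOT, name), exist_ok=True)
        write(name, "cpu.shares", shares)

def add_container(subtree, container_id, cpus):
    # Each container lives in exactly one subtree and gets shares
    # proportional to its cpus plus a CFS quota as a hard cap.
    cgroup = os.path.join(subtree, container_id)
    os.makedirs(os.path.join(CPU_ROOT, cgroup), exist_ok=True)
    write(cgroup, "cpu.shares", int(1024 * cpus))
    period = 100000                       # default CFS period: 100ms
    write(cgroup, "cpu.cfs_period_us", period)
    write(cgroup, "cpu.cfs_quota_us", int(cpus * period))
{code}

For example, add_container("mesos_revocable", "container-abc", cpus=0.5) would cap a revocable container at half a CPU while its whole subtree stays out-weighted 20:1 by the normal subtree under contention.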