From: Vinod Kone
Date: Wed, 12 Jun 2013 16:59:24 -0700
Subject: Re: Mesos, lxc and ubuntu 12
To: mesos-user@incubator.apache.org

No problem. Instead of giving --isolation=lxc, you could give
--isolation=cgroups. For more flags, start the mesos slave with --help.
Unfortunately, we have been a bit behind on the documentation, so the only
place to look is our header files (e.g., src/slave/cgroups_isolation.hpp).
That said, if your kernel supports it, cgroups should work out of the box
with mesos.

HTH,

-- Vinod

On Wed, Jun 12, 2013 at 4:52 PM, Dmitriy Lyubimov wrote:

> Oops. I am just starting with this. I see it is clearly not working. I just
> downloaded 0.11 and am trying to set up spark 0.7.2 with it. It works ok
> with "process" isolation. I assumed lxc would be preferable since it is
> advertised as a feature on the Mesos home page.
>
> I will snoop around the docs looking for cgroups isolation. If you can
> point me to a manual, I'd be grateful too.
>
> On Wed, Jun 12, 2013 at 4:48 PM, Vinod Kone wrote:
>
>> Hi Dmitriy,
>>
>> What version of mesos are you using? Lxc support has been deprecated for
>> a while now. You should use the new cgroups isolation.
>>
>> On Wed, Jun 12, 2013 at 4:26 PM, Dmitriy Lyubimov wrote:
>>
>>> Hello,
>>>
>>> Is there anything specific to Ubuntu 12 that needs to be done to make
>>> Mesos work with LXC?
>>>
>>> I set things up according to the Ubuntu docs,
>>> https://help.ubuntu.com/12.10/serverguide/lxc.html#lxc-creation
>>>
>>> and all container examples there seem to be happily working.
>>>
>>> However, some mesos unit tests are failing (which I suspect is related
>>> to lxc), and lxc isolation mode fails to spawn tasks.
>>>
>>> (I am actually on Ubuntu 12.04 LTS.)
>>>
>>> Is there any specific way to troubleshoot this? Is LXC in Mesos even
>>> working with Ubuntu 12?
>>>
>>> Thank you in advance. (Slave output enclosed.)
>>> -d
>>>
>>> I0612 16:24:20.682698 26452 slave.cpp:474] Got assigned task 0 for framework 201306121623-16777343-5050-26417-0000
>>> I0612 16:24:20.683425 26452 paths.hpp:234] Created executor directory '/tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363'
>>> I0612 16:24:20.683630 26453 lxc_isolation_module.cpp:121] Launching Task 0 ("/home/dmitr...) (/usr/local/libexec/mesos/mesos-executor) in /tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363 with resources ' for framework 201306121623-16777343-5050-26417-0000
>>> I0612 16:24:20.683945 26453 lxc_isolation_module.cpp:152] Forked executor at = 26570
>>> lxc-execute: No such file or directory - failed to create '/sys/fs/cgroup/cpuset//lxc/mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000' directory
>>> lxc-execute: failed to spawn 'mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000'
>>> lxc-execute: No such file or directory - failed to remove cgroup '/sys/fs/cgroup/cpuset//lxc/mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000'
>>> I0612 16:24:21.451616 26452 lxc_isolation_module.cpp:322] Telling slave of lost executor Task 0 ("/home/dmitr...) of framework 201306121623-16777343-5050-26417-0000
>>> I0612 16:24:21.451709 26452 lxc_isolation_module.cpp:239] Stopping container mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000
>>> I0612 16:24:21.452199 26454 slave.cpp:998] Executor 'Task 0 ("/home/dmitr...)' of framework 201306121623-16777343-5050-26417-0000 has exited with status 255
>>> sh: 1: Syntax error: "(" unexpected
>>> E0612 16:24:21.453227 26452 lxc_isolation_module.cpp:248] Failed to stop container mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000, lxc-stop returned: 512
>>> I0612 16:24:21.453385 26454 slave.cpp:829] Status update: task 0 of framework 201306121623-16777343-5050-26417-0000 is now in state TASK_FAILED
>>> E0612 16:24:21.453583 26453 lxc_isolation_module.cpp:273] ERROR! Asked to update resources for an unknown executor!
>>> I0612 16:24:21.453891 26451 gc.cpp:97] Scheduling /tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363 for removal
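
Vinod's remark that cgroups isolation should work "if your kernel supports it" can be checked before restarting the slave with --isolation=cgroups. The following is a minimal Python sketch, not part of Mesos, that assumes a Linux host exposing /proc/cgroups in its usual four-column layout (subsystem name, hierarchy, number of cgroups, enabled flag). The function name parse_proc_cgroups and the SAMPLE text are hypothetical and exist only for illustration. The thread does not say which subsystems Mesos requires, so the sketch only reports what the kernel lists.

```python
from pathlib import Path


def parse_proc_cgroups(text):
    """Map each cgroup subsystem name to True if the kernel reports it enabled."""
    subsystems = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#subsys_name ...' header row.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore malformed rows rather than guessing
        name, _hierarchy, _num_cgroups, enabled = fields[:4]
        subsystems[name] = enabled == "1"
    return subsystems


# Illustrative sample only (an assumption); real contents vary by kernel.
SAMPLE = """#subsys_name\thierarchy\tnum_cgroups\tenabled
cpuset\t1\t4\t1
cpu\t2\t4\t1
memory\t0\t1\t0
"""

if __name__ == "__main__":
    path = Path("/proc/cgroups")
    # Fall back to the sample when not running on a Linux host.
    text = path.read_text() if path.exists() else SAMPLE
    for name, enabled in sorted(parse_proc_cgroups(text).items()):
        print(f"{name}: {'enabled' if enabled else 'disabled'}")
```

If a subsystem shows up as disabled or is missing, that is a hint to look at the kernel configuration or boot parameters before blaming Mesos itself.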