incubator-mesos-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Mesos, lxc and ubuntu 12
Date Thu, 13 Jun 2013 18:35:48 GMT
Thank you for your reply.

Now I am getting this FATAL message:


F0613 11:27:48.010704 24340 cgroups_isolation_module.cpp:147] Cannot create
cgroups hierarchy root at /sys/fs/cgroup. Consider removing it
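
For reference, the lssubsys -am listing quoted further down shows every mounted subsystem in its own directory under /sys/fs/cgroup, so /sys/fs/cgroup itself does not look like a single hierarchy. A minimal Python sketch for summarizing that kind of listing follows. The names parse_lssubsys and hierarchies are hypothetical helpers, not part of Mesos or libcgroup, and the comma handling for co-mounted subsystems is an assumption about the lssubsys output format.

```python
from collections import defaultdict

# Output of `lssubsys -am` as quoted in this thread.
SAMPLE = """\
cpuset
cpu /sys/fs/cgroup/cpu
cpuacct /sys/fs/cgroup/cpuacct
memory /sys/fs/cgroup/memory
devices /sys/fs/cgroup/devices
freezer /sys/fs/cgroup/freezer
blkio
perf_event
"""


def parse_lssubsys(output):
    """Map each subsystem name to its mount point, or None if unmounted."""
    mounts = {}
    for line in output.splitlines():
        parts = line.split()
        if not parts:
            continue
        mount_point = parts[1] if len(parts) > 1 else None
        # Assumption: co-mounted subsystems are listed comma-separated.
        for subsystem in parts[0].split(","):
            mounts[subsystem] = mount_point
    return mounts


def hierarchies(mounts):
    """Group mounted subsystems by mount point (one entry per hierarchy)."""
    grouped = defaultdict(list)
    for subsystem, mount_point in mounts.items():
        if mount_point is not None:
            grouped[mount_point].append(subsystem)
    return dict(grouped)


if __name__ == "__main__":
    mounts = parse_lssubsys(SAMPLE)
    for mount_point, subsystems in sorted(hierarchies(mounts).items()):
        print(mount_point, ",".join(subsystems))
    unmounted = sorted(s for s, m in mounts.items() if m is None)
    print("not mounted:", ", ".join(unmounted))
```

On the listing from this thread, this reports five separate mount points, one per mounted subsystem, and shows cpuset, blkio and perf_event as not mounted.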



On Wed, Jun 12, 2013 at 6:06 PM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:

> Can you also pass --cgroups_hierarchy_root=/sys/fs/cgroup to the slave?
>
> In 0.11.0, we don't make an attempt to detect the cgroup hierarchy
> location.
>
>
> On Wed, Jun 12, 2013 at 6:01 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>
>> also, listing cgroup produces :
>> dmitriy@BigHP:~$ lscgroup
>> cpu:/
>> cpu:/sysdefault
>> cpuacct:/
>> cpuacct:/sysdefault
>> devices:/
>> devices:/sysdefault
>> memory:/
>> memory:/sysdefault
>> freezer:/
>> freezer:/sysdefault
>>
>>
>>
>> On Wed, Jun 12, 2013 at 6:01 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>
>>> also, listing cgroups produces
>>>
>>>
>>>
>>> On Wed, Jun 12, 2013 at 5:42 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>>
>>>> and here's the list of subsystems if it is relevant.
>>>> lssubsys -am
>>>> cpuset
>>>> cpu /sys/fs/cgroup/cpu
>>>> cpuacct /sys/fs/cgroup/cpuacct
>>>> memory /sys/fs/cgroup/memory
>>>> devices /sys/fs/cgroup/devices
>>>> freezer /sys/fs/cgroup/freezer
>>>> blkio
>>>> perf_event
>>>>
>>>>
>>>>
>>>> On Wed, Jun 12, 2013 at 5:34 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 12, 2013 at 5:24 PM, Benjamin Mahler <
>>>>> benjamin.mahler@gmail.com> wrote:
>>>>>
>>>>>> Cgroups does not allow one to mount the same subsystem across cgroup
>>>>>> hierarchies. Do you have multiple cgroup hierarchies present on your
>>>>>> machine?
>>>>>>
>>>>>
>>>>> I frankly know nothing of cgroups. How do I check? Not that I know of.
>>>>>
>>>>>>
>>>>>> Ideally this will work with a stock ubuntu 12 OS, but it's possible
>>>>>> that ubuntu already mounts a cgroup hierarchy with the freezer subsystem
>>>>>> in a location we did not expect.
>>>>>>
>>>>>> What are the contents of the root directory on that machine?
>>>>>>
>>>>> bin  boot  cdrom  dev  etc  hadoop  home  initrd.img  initrd.img.old
>>>>> lib  lib64  lost+found  media  mnt  opt  proc  root  run  sbin
>>>>> selinux  srv  sys  tmp  usr  var  vmlinuz  vmlinuz.old
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Jun 12, 2013 at 5:19 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>>>>>
>>>>>>> ok thanks.
>>>>>>>
>>>>>>> now i switched to cgroups and can't get slave to start. The fatal
>>>>>>> error says
>>>>>>>
>>>>>>> F0612 17:17:04.053773 10059 cgroups_isolation_module.cpp:161]
>>>>>>> Required subsystem 'freezer' is already in use
>>>>>>>
>>>>>>> Any hints appreciated.
>>>>>>>
>>>>>>> thank you.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 12, 2013 at 4:59 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>>>>>>>
>>>>>>>> No problem. Instead of giving --isolation=lxc, you could give
>>>>>>>> --isolation=cgroups. Also, for more flags, start the mesos slave with
>>>>>>>> --help. Unfortunately, we have been a bit behind on the documentation,
>>>>>>>> so the only place you can look is our header files (e.g.,
>>>>>>>> src/slave/cgroups_isolation.hpp). That said, if your kernel supports
>>>>>>>> it, cgroups should work out of the box with mesos.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Vinod
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 12, 2013 at 4:52 PM, Dmitriy Lyubimov <
>>>>>>>> dlieu.7@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Oops. I am just starting with this. I see it is clearly not working.
>>>>>>>>> I just downloaded 0.11 and am trying to set up spark 0.7.2 with it. It
>>>>>>>>> works ok with "process" isolation. I assumed lxc would be preferable
>>>>>>>>> since it is advertised as a feature on the Mesos home page.
>>>>>>>>>
>>>>>>>>> I will snoop around the docs looking for cgroups isolation. If you
>>>>>>>>> can point me to a manual, I'd be grateful too.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 12, 2013 at 4:48 PM, Vinod Kone <vinodkone@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Dmitry,
>>>>>>>>>>
>>>>>>>>>> What version of mesos are you using? Lxc support has been
>>>>>>>>>> deprecated for a while now. You should use the new cgroups isolation.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 12, 2013 at 4:26 PM, Dmitriy Lyubimov <
>>>>>>>>>> dlieu.7@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> is there anything specific to ubuntu 12 that needs to be done
>>>>>>>>>>> to make Mesos work with LXC?
>>>>>>>>>>>
>>>>>>>>>>> I set things up according to ubuntu docs,
>>>>>>>>>>> https://help.ubuntu.com/12.10/serverguide/lxc.html#lxc-creation
>>>>>>>>>>>
>>>>>>>>>>> and all container examples there seem to be happily working.
>>>>>>>>>>>
>>>>>>>>>>> However, some mesos unit tests are failing (which I suspect are
>>>>>>>>>>> related to lxc), and lxc isolation mode fails to spawn tasks.
>>>>>>>>>>>
>>>>>>>>>>> (I am actually on ubuntu 12.04 LTS).
>>>>>>>>>>>
>>>>>>>>>>> Is there any specific way to troubleshoot this? Is LXC in Mesos
>>>>>>>>>>> even working with Ubuntu 12?
>>>>>>>>>>>
>>>>>>>>>>> thank you in advance. (slave output enclosed).
>>>>>>>>>>> -d
>>>>>>>>>>>
>>>>>>>>>>> I0612 16:24:20.682698 26452 slave.cpp:474] Got assigned task 0 for framework 201306121623-16777343-5050-26417-0000
>>>>>>>>>>> I0612 16:24:20.683425 26452 paths.hpp:234] Created executor directory '/tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363'
>>>>>>>>>>> I0612 16:24:20.683630 26453 lxc_isolation_module.cpp:121] Launching Task 0 ("/home/dmitr...) (/usr/local/libexec/mesos/mesos-executor) in /tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363 with resources ' for framework 201306121623-16777343-5050-26417-0000
>>>>>>>>>>> I0612 16:24:20.683945 26453 lxc_isolation_module.cpp:152] Forked executor at = 26570
>>>>>>>>>>> lxc-execute: No such file or directory - failed to create '/sys/fs/cgroup/cpuset//lxc/mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000' directory
>>>>>>>>>>> lxc-execute: failed to spawn 'mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000'
>>>>>>>>>>> lxc-execute: No such file or directory - failed to remove cgroup '/sys/fs/cgroup/cpuset//lxc/mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000'
>>>>>>>>>>> I0612 16:24:21.451616 26452 lxc_isolation_module.cpp:322] Telling slave of lost executor Task 0 ("/home/dmitr...) of framework 201306121623-16777343-5050-26417-0000
>>>>>>>>>>> I0612 16:24:21.451709 26452 lxc_isolation_module.cpp:239] Stopping container mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000
>>>>>>>>>>> I0612 16:24:21.452199 26454 slave.cpp:998] Executor 'Task 0 ("/home/dmitr...)' of framework 201306121623-16777343-5050-26417-0000 has exited with status 255
>>>>>>>>>>> sh: 1: Syntax error: "(" unexpected
>>>>>>>>>>> E0612 16:24:21.453227 26452 lxc_isolation_module.cpp:248] Failed to stop container mesos_executor_Task 0 ("/home/dmitr...)_framework_201306121623-16777343-5050-26417-0000, lxc-stop returned: 512
>>>>>>>>>>> I0612 16:24:21.453385 26454 slave.cpp:829] Status update: task 0 of framework 201306121623-16777343-5050-26417-0000 is now in state TASK_FAILED
>>>>>>>>>>> E0612 16:24:21.453583 26453 lxc_isolation_module.cpp:273] ERROR! Asked to update resources for an unknown executor!
>>>>>>>>>>> I0612 16:24:21.453891 26451 gc.cpp:97] Scheduling /tmp/mesos/slaves/201306121623-16777343-5050-26417-0/frameworks/201306121623-16777343-5050-26417-0000/executors/Task 0 ("/home/dmitr...)/runs/9156d4fa-a177-464b-906f-fb62c8b9b363 for removal
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
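
A side note on the 'sh: 1: Syntax error: "(" unexpected' line in the slave output quoted above: it is consistent with the container name, which contains spaces, a double quote and parentheses, reaching a shell unquoted. That reading is an assumption based only on the log, not on the Mesos source. A minimal Python sketch of the quoting issue follows; stop_command is a hypothetical helper and the command string is only an illustration built around lxc-stop -n.

```python
import shlex

# Container name copied verbatim from the slave log (the path is
# truncated in the log itself and is left that way here).
NAME = ('mesos_executor_Task 0 ("/home/dmitr...)'
        '_framework_201306121623-16777343-5050-26417-0000')


def stop_command(name):
    """Build a shell-safe command line (hypothetical; not Mesos code)."""
    # shlex.quote wraps the name so that sh treats it as one literal word.
    return "lxc-stop -n " + shlex.quote(name)


if __name__ == "__main__":
    print(stop_command(NAME))
```

Without the quoting step, a POSIX shell would try to interpret the spaces and the "(" in the name itself.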
