Date: Fri, 25 Dec 2015 03:04:49 +0000 (UTC)
From: "Yubao Liu (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Comment Edited] (MESOS-4248) mesos slave can't start in CentOS-7 docker container

    [ https://issues.apache.org/jira/browse/MESOS-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070890#comment-15070890 ]

Yubao Liu edited comment on MESOS-4248 at 12/25/15 3:04 AM:
------------------------------------------------------------

The command to start the container:

{code}
docker run --cap-add SYS_ADMIN -e container=docker -dt centos:7 bash -c 'mount | grep /sys/fs/cgroup/ | awk "{print \$3}" | xargs -n 1 umount; find /etc/systemd/system /usr/lib/systemd/system -name "*tty*" -delete; exec /usr/sbin/init'
{code}

The "umount" here is to get writable cgroup mounts, so that systemd and mesos-slave in the container can write there.


was (Author: liuyb):
The command to start container:

{code}
docker run --cap-add SYS_ADMIN -e container=docker -dt centos:7 bash -c 'mount | grep /sys/fs/cgroup/ | awk "{print \$3}" | xargs -n 1 umount; find /etc/systemd/system /usr/lib/systemd/system -name "*tty*" -delete; exec /usr/sbin/init'
{code}

I do "umount" here is to get writable cgroup mounts, so that systemd and mesos-slave in container can write there.

> mesos slave can't start in CentOS-7 docker container
> ----------------------------------------------------
>
> Key: MESOS-4248
> URL: https://issues.apache.org/jira/browse/MESOS-4248
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 0.26.0
> Environment: My host OS is Debian Jessie, the container OS is CentOS 7.2.
> {code}
> # cat /etc/system-release
> CentOS Linux release 7.2.1511 (Core)
> # rpm -qa |grep mesos
> mesosphere-zookeeper-3.4.6-0.1.20141204175332.centos7.x86_64
> mesosphere-el-repo-7-1.noarch
> mesos-0.26.0-0.2.145.centos701406.x86_64
> $ docker version
> Client:
>  Version: 1.9.1
>  API version: 1.21
>  Go version: go1.4.2
>  Git commit: a34a1d5
>  Built: Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch: linux/amd64
> Server:
>  Version: 1.9.1
>  API version: 1.21
>  Go version: go1.4.2
>  Git commit: a34a1d5
>  Built: Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch: linux/amd64
> {code}
> Reporter: Yubao Liu
>
> // See the "Environment" field above for the software versions.
> "systemctl start mesos-slave" can't start mesos-slave:
> {code}
> # journalctl -u mesos-slave
> ....
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Started Mesos Slave.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Starting Mesos Slave...
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210180 12838 logging.cpp:172] INFO level logging started!
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210603 12838 main.cpp:190] Build: 2015-12-16 23:06:16 by root
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210625 12838 main.cpp:192] Version: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210634 12838 main.cpp:195] Git tag: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210644 12838 main.cpp:199] Git SHA: d3717e5c4d1bf4fca5c41cd7ea54fae489028faa
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210765 12838 containerizer.cpp:142] Using isolation: posix/cpu,posix/mem,filesystem/posix
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.215638 12838 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.220279 12838 systemd.cpp:128] systemd version `219` detected
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.227017 12838 systemd.cpp:210] Started systemd slice `mesos_executors.slice`
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: Failed to create a containerizer: Could not create MesosContainerizer: Failed to create launcher: Failed to locate systemd cgroups hierarchy: does not exist
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service: main process exited, code=exited, status=1/FAILURE
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Unit mesos-slave.service entered failed state.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service failed.
> {code}
> I used strace to debug it. mesos-slave tried to access "/sys/fs/cgroup/systemd/mesos_executors.slice", but the slice is actually at "/sys/fs/cgroup/systemd/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope/mesos_executors.slice/". mesos-slave should check "/proc/self/cgroup" to find those intermediate directories:
> {code}
> # cat /proc/self/cgroup
> 8:perf_event:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 7:blkio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 6:net_cls,net_prio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 5:freezer:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 4:devices:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 3:cpu,cpuacct:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 2:cpuset:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 1:name=systemd:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
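
As a rough illustration of the suggestion in the report (checking "/proc/self/cgroup" to find the intermediate directories), the lookup could be sketched as below. This is not Mesos code. The function names, the default mount root, and the assumption that the slice sits directly under the process's own name=systemd cgroup are all illustrative, taken from the paths quoted in the report. That last assumption may not hold for a process that systemd has moved into its own unit cgroup.

```python
import posixpath


def systemd_cgroup_dir(cgroup_text, mount_root="/sys/fs/cgroup/systemd"):
    """Return the directory of the name=systemd cgroup described by
    cgroup_text (the contents of a /proc/<pid>/cgroup file), or None
    if no name=systemd line is present.

    mount_root is an assumed mount point, as seen in the report.
    """
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        if "name=systemd" in controllers.split(","):
            # The cgroup path is relative to the hierarchy's mount point.
            return posixpath.normpath(
                posixpath.join(mount_root, path.lstrip("/")))
    return None


def slice_dir(cgroup_text, slice_name="mesos_executors.slice"):
    """Hypothetical helper: where slice_name would be expected if it
    lives directly under the process's own name=systemd cgroup."""
    base = systemd_cgroup_dir(cgroup_text)
    return None if base is None else posixpath.join(base, slice_name)


if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as f:
            print(slice_dir(f.read()))
    except OSError:
        # /proc/self/cgroup is only available on Linux.
        print("no /proc/self/cgroup on this system")
```

With the "/proc/self/cgroup" contents from the report, slice_dir() returns the docker-...scope/mesos_executors.slice path that strace showed, rather than "/sys/fs/cgroup/systemd/mesos_executors.slice".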