Date: Mon, 8 May 2017 18:59:04 +0000 (UTC)
From: "Joseph Wu (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Updated] (MESOS-7466) Mesos marathon and docker not synchronized

     [ https://issues.apache.org/jira/browse/MESOS-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Wu updated MESOS-7466:
-----------------------------
    Docs Text:   (was: the full issue report; the removed text is identical to the Description field below)
       Labels:   (was: github-import)
  Description:

I submitted an application group in Marathon, then deleted it because there was a problem with the group's definition, and then submitted it again via the Marathon API. After several rounds of deleting and resubmitting, I ran into problems (this is the third time I have encountered this). After I deleted the application group in the Marathon UI, some entries remained under `deployments` and could not be deleted; the error was `error destroying null: app '/Null' does not exist`. In the Mesos UI the task status is `running`, and the Docker container status is `up`. I then restarted mesos-master, mesos-slave, Marathon, and ZooKeeper, but the result was the same.

It was suggested that I clear Marathon's state in ZooKeeper, and I did; Marathon went back to normal, as if it had been freshly installed. But the tasks I want to delete are still shown as `running` in the Mesos UI. After that I also cleared the Mesos state in ZooKeeper, with the same result: the tasks I want to delete are still in the Mesos UI, yet the containers they should be running in do not exist in Docker, and I do not know why. (A sketch of the ZooKeeper cleanup step appears after the version output below.)

Here is my configuration info:

{noformat}
$ curl -sSL http://172.30.30.4:5050/version | python -m json.tool
{
    "build_date": "2017-04-12 16:39:09",
    "build_time": 1492015149.0,
    "build_user": "centos",
    "git_sha": "de306b5786de3c221bae1457c6f2ccaeb38eef9f",
    "git_tag": "1.2.0",
    "version": "1.2.0"
}

$ curl -sSL http://172.30.30.4:8080/v2/info | python -m json.tool
{
    ......
    "name": "marathon",
    "version": "1.4.2",
    ......
}
{noformat}
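For reference, the ZooKeeper cleanup described above boils down to deleting the framework znodes. A minimal sketch using zkCli.sh, assuming the default znode paths (`/mesos`, as set by the `--zk` flags below, and `/marathon`, Marathon's default) and that the masters and Marathon are stopped first; the zkCli.sh location is illustrative and varies by distribution:

{noformat}
# Connect to one member of the ZooKeeper ensemble.
$ /usr/lib/zookeeper/bin/zkCli.sh -server 172.30.30.4:2181

# Inspect what is stored before deleting anything.
[zk: 172.30.30.4:2181(CONNECTED)] ls /
[zk: 172.30.30.4:2181(CONNECTED)] ls /marathon

# Recursively remove Marathon's state ("rmr" in ZooKeeper 3.4;
# newer releases call this "deleteall").
[zk: 172.30.30.4:2181(CONNECTED)] rmr /marathon

# Likewise for the Mesos master state, if resetting that as well.
[zk: 172.30.30.4:2181(CONNECTED)] rmr /mesos
{noformat}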
{noformat}
$ docker version
Client:
 Version:      1.12.5
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   7392c3b
 Built:        Fri Dec 16 02:23:59 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.5
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   7392c3b
 Built:        Fri Dec 16 02:23:59 2016
 OS/Arch:      linux/amd64
{noformat}

{noformat}
$ docker info
Containers: 2
 Running: 1
 Paused: 0
 Stopped: 1
Images: 16
Server Version: 1.12.5
Storage Driver: devicemapper
 Pool Name: docker-253:0-403269431-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 4.915 GB
 Data Space Total: 107.4 GB
 Data Space Available: 94.26 GB
 Metadata Space Used: 7.115 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.14 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.135-RHEL7 (2016-09-28)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: calico null bridge host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-514.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.51 GiB
Name: slave1
ID: 4VRS:RPSC:ASAX:MTRJ:TOGC:6RFL:6RJS:J4MK:3VSN:JZO2:Q2FB:LJEW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Cluster Store: etcd://172.30.30.5:2379
Cluster Advertise: 172.30.30.5:2375
Insecure Registries:
 172.30.30.8:80
 127.0.0.0/8
{noformat}
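The `docker info` output above includes the devicemapper loopback warning, which is unrelated to the stuck tasks but worth fixing on a production host. A minimal sketch of the suggested `dm.thinpooldev` setting via `daemon.json`, assuming an LVM thin pool named `docker-thinpool` in a volume group `docker` has already been provisioned (both names are illustrative, not taken from this cluster):

{noformat}
# /etc/docker/daemon.json
# "storage-opts" passes the same values as the --storage-opt daemon flag.
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.thinpooldev=/dev/mapper/docker-thinpool"
  ]
}
{noformat}

The Docker daemon must be restarted after this change, and data stored on the old loopback devices is not migrated automatically.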
{noformat}
$ /usr/sbin/mesos-master \
    --zk=zk://172.30.30.4:2181,172.30.30.12:2181/mesos \
    --port=5050 \
    --log_dir=/var/log/mesos \
    --cluster=cp_cluster \
    --hostname=172.30.30.4 \
    --quorum=1 \
    --work_dir=/var/lib/mesos

$ /usr/sbin/mesos-slave \
    --master=zk://172.30.30.4:2181,172.30.30.12:2181/mesos \
    --log_dir=/var/log/mesos \
    --containerizers=docker,mesos \
    --executor_registration_timeout=5mins \
    --hostname=172.30.30.5 \
    --ip=172.30.30.5 \
    --isolation=filesystem/linux,docker/runtime,network/cni \
    --network_cni_config_dir=/var/lib/mesos/cni/config \
    --network_cni_plugins_dir=/var/lib/mesos/cni/plugins \
    --work_dir=/var/lib/mesos
{noformat}

{noformat}
$ curl -sSL http://172.30.30.12:5050/master/tasks | python -m json.tool | grep "\"id\""
        "id": "monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef",
        "id": "monitor-tools_prometheus.c6bdfccc-3233-11e7-bd32-024222b481ef",
        "id": "monitor-tools_alertmanager.c3bfff0b-3233-11e7-bd32-024222b481ef",
        "id": "monitor-exporter_node-exporter.0e44c1c5-3232-11e7-bd32-024222b481ef",
        "id": "monitor-exporter_cadvisor.0e455e07-3232-11e7-bd32-024222b481ef",
        "id": "monitor-exporter_mesos-exporter.0e43fe73-3232-11e7-bd32-024222b481ef",
        "id": "syslog.e0645b43-3238-11e7-90c2-02423620ccdc",
        "id": "syslog.e72bbd15-3238-11e7-90c2-02423620ccdc",
        "id": "syslog.a0b162ff-316f-11e7-bd32-024222b481ef",
        "id": "mesos-dns.1286663d-30ae-11e7-bd32-024222b481ef",
        "id": "mesos-dns.12c20fae-30ae-11e7-bd32-024222b481ef",
        "id": "mesos-dns.1286181c-30ae-11e7-bd32-024222b481ef",
{noformat}

{noformat}
# slave1
$ docker ps -a
CONTAINER ID   IMAGE                        COMMAND         CREATED       STATUS       PORTS   NAMES
e05468eaedc2   quay.io/calico/node:v1.1.0   "start_runit"   2 weeks ago   Up 4 hours           calico-node

# slave2
$ docker ps -a
CONTAINER ID   IMAGE                        COMMAND         CREATED       STATUS       PORTS   NAMES
7fcbb3f2fa61   quay.io/calico/node:v1.1.0   "start_runit"   2 weeks ago   Up 4 hours           calico-node

# slave3
$ docker ps -a
CONTAINER ID   IMAGE                        COMMAND         CREATED       STATUS       PORTS   NAMES
c69eba5bc322   quay.io/calico/node:v1.1.0   "start_runit"   2 weeks ago   Up 4 hours           calico-node
{noformat}

The `monitor-***` tasks are the ones I want to delete, but they no longer exist in Docker (`docker ps -a` does not list them). Here is the `stdout` log of one of the `monitor-***` tasks:

{noformat}
......
Received killTask for task monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef
Received killTask for task monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef
Received killTask for task monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef
...... (the same killTask line repeats many more times)
Re-registered docker executor on 172.30.30.6
Re-registered docker executor on 172.30.30.6
{noformat}
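One way to cross-check whether a Mesos task still maps to a container on an agent is to filter on the `MESOS_TASK_ID` label that the Docker containerizer sets on each container it starts (visible in the `docker run` command in the `stderr` log below). A sketch, using the task ID from the logs above:

{noformat}
# List any container (running or exited) started for this Mesos task;
# the docker containerizer labels each container with MESOS_TASK_ID.
$ docker ps -a --filter "label=MESOS_TASK_ID=monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef"

# Containers started by the docker containerizer are also named
# mesos-<agent-id>.<container-id>, so a name filter works too:
$ docker ps -a --filter "name=mesos-"
{noformat}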
And here is its `stderr`:

{noformat}
I0506 23:26:45.211597 17482 exec.cpp:162] Version: 1.2.0
I0506 23:26:45.224807 17489 exec.cpp:237] Executor registered on agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 23:26:45.310137 17486 docker.cpp:850] Running docker -H unix:///var/run/docker.sock run --cpu-shares 512 --memory 1073741824 --env-file /tmp/qT2brz -v /etc/localtime:/etc/localtime:ro -v /var/lib/mesos/slaves/e79deb05-5f20-48ca-9de9-a9610504e040-S1/frameworks/1e68ea0f-0f0b-4f14-8e2e-ab10169ee5f3-0000/executors/monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef/runs/7674214b-a609-47e6-a51a-7129616e8494:/mnt/mesos/sandbox --net calico1 --log-driver=syslog --log-opt=syslog-address=tcp://172.30.30.11:514 --log-opt=tag=grafana --label=MESOS_TASK_ID=monitor-tools_grafana.ea8672ed-3233-11e7-bd32-024222b481ef --name mesos-e79deb05-5f20-48ca-9de9-a9610504e040-S1.7674214b-a609-47e6-a51a-7129616e8494 grafana/grafana
I0506 16:57:13.687083 17489 exec.cpp:488] Agent exited, but framework has checkpointing enabled. Waiting 15mins to reconnect with agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 16:57:14.359922 17486 exec.cpp:283] Received reconnect request from agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 16:57:14.366843 17488 exec.cpp:260] Executor re-registered on agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 17:45:40.419311 17485 exec.cpp:488] Agent exited, but framework has checkpointing enabled. Waiting 15mins to reconnect with agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 17:45:41.629977 17485 exec.cpp:283] Received reconnect request from agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 17:45:41.658910 17485 exec.cpp:260] Executor re-registered on agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 18:02:20.707939 17485 exec.cpp:488] Agent exited, but framework has checkpointing enabled. Waiting 15mins to reconnect with agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 18:03:43.669370 17485 exec.cpp:283] Received reconnect request from agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
I0506 18:03:43.698617 17485 exec.cpp:260] Executor re-registered on agent e79deb05-5f20-48ca-9de9-a9610504e040-S1
{noformat}

Now I want to know how to remove these tasks, and where are these tasks actually running? (I do not think they run in a Mesos container, because I specified `"type": "DOCKER"`.)

One slave has also been throwing the following kernel messages:

{noformat}
Message from syslogd@slave2 at May 6 08:27:35 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

Message from syslogd@slave2 at May 6 08:27:45 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
{noformat}
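A possible cleanup path, sketched here rather than verified against this cluster: since Marathon's ZooKeeper state was wiped, the framework that owned these tasks is now orphaned on the master. The Mesos master exposes a teardown endpoint that kills everything belonging to a framework; the framework ID below is taken from the sandbox path in the `stderr` log above (`frameworks/<framework-id>/...`):

{noformat}
# Confirm the stale framework ID in the master's state.
$ curl -sSL http://172.30.30.4:5050/master/state | python -m json.tool | less

# Ask the master to tear down the orphaned framework; this kills
# all tasks that still belong to it.
$ curl -X POST http://172.30.30.4:5050/master/teardown \
       -d 'frameworkId=1e68ea0f-0f0b-4f14-8e2e-ab10169ee5f3-0000'
{noformat}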
> Mesos marathon and docker not synchronized
> ------------------------------------------
>
>                 Key: MESOS-7466
>                 URL: https://issues.apache.org/jira/browse/MESOS-7466
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: jasontom
>            Priority: Critical
>
> (The full report is quoted in the Description above.)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)