mesos-issues mailing list archives

From "Deshi Xiao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-7210) MESOS HTTP checks doesn't work when mesos runs with --docker_mesos_image ( pid namespace mismatch )
Date Tue, 04 Apr 2017 20:20:42 GMT

    [ https://issues.apache.org/jira/browse/MESOS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955733#comment-15955733 ]

Deshi Xiao commented on MESOS-7210:
-----------------------------------

Try:
dockerInfo.parameters.push_back("--pid=host");

Would that be correct?

> MESOS HTTP checks doesn't work when mesos runs with --docker_mesos_image ( pid namespace mismatch )
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-7210
>                 URL: https://issues.apache.org/jira/browse/MESOS-7210
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 1.1.0, 1.1.1, 1.2.0
>         Environment: Ubuntu 16.04.02
> Docker version 1.13.1
> mesos 1.1.0, runs from container
> docker containers  spawned by marathon 1.4.1
>            Reporter: Wojciech Sielski
>            Assignee: haosdent
>            Priority: Critical
>
> When running mesos-slave with option "docker_mesos_image" like:
> {code}
> --master=zk://standalone:2181/mesos  --containerizers=docker,mesos  --executor_registration_timeout=5mins  --hostname=standalone  --ip=0.0.0.0  --docker_stop_timeout=5secs  --gc_delay=1days  --docker_socket=/var/run/docker.sock  --no-systemd_enable_support  --work_dir=/tmp/mesos  --docker_mesos_image=panteras/paas-in-a-box:0.4.0
> {code}
> from the container that was started with option "pid: host" like:
> {code}
>   net:        host
>   privileged: true
>   pid:        host
> {code}
> and an example Marathon job that uses MESOS_HTTP checks, like:
> {code}
> {
>  "id": "python-example-stable",
>  "cmd": "python3 -m http.server 8080",
>  "mem": 16,
>  "cpus": 0.1,
>  "instances": 2,
>  "container": {
>    "type": "DOCKER",
>    "docker": {
>      "image": "python:alpine",
>      "network": "BRIDGE",
>      "portMappings": [
>         { "containerPort": 8080, "hostPort": 0, "protocol": "tcp" }
>      ]
>    }
>  },
>  "env": {
>    "SERVICE_NAME" : "python"
>  },
>  "healthChecks": [
>    {
>      "path": "/",
>      "portIndex": 0,
>      "protocol": "MESOS_HTTP",
>      "gracePeriodSeconds": 30,
>      "intervalSeconds": 10,
>      "timeoutSeconds": 30,
>      "maxConsecutiveFailures": 3
>    }
>  ]
> }
> {code}
> I see errors like:
> {code}
> F0306 07:41:58.844293    35 health_checker.cpp:94] Failed to enter the net namespace of task (pid: '13527'): Pid 13527 does not exist
> *** Check failure stack trace: ***
>     @     0x7f51770b0c1d  google::LogMessage::Fail()
>     @     0x7f51770b29d0  google::LogMessage::SendToLog()
>     @     0x7f51770b0803  google::LogMessage::Flush()
>     @     0x7f51770b33f9  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7f517647ce46  _ZNSt17_Function_handlerIFivEZN5mesos8internal6health14cloneWithSetnsERKSt8functionIS0_E6OptionIiERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISG_EEEUlvE_E9_M_invokeERKSt9_Any_data
>     @     0x7f517647bf2b  mesos::internal::health::cloneWithSetns()
>     @     0x7f517648374b  std::_Function_handler<>::_M_invoke()
>     @     0x7f5177068167  process::internal::cloneChild()
>     @     0x7f5177065c32  process::subprocess()
>     @     0x7f5176481a9d  mesos::internal::health::HealthCheckerProcess::_httpHealthCheck()
>     @     0x7f51764831f7  mesos::internal::health::HealthCheckerProcess::_healthCheck()
>     @     0x7f517701f38c  process::ProcessBase::visit()
>     @     0x7f517702c8b3  process::ProcessManager::resume()
>     @     0x7f517702fb77  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
>     @     0x7f51754ddc80  (unknown)
>     @     0x7f5174cf06ba  start_thread
>     @     0x7f5174a2682d  (unknown)
> I0306 07:41:59.077986     9 health_checker.cpp:199] Ignoring failure as health check still in grace period
> {code}
> It looks like the docker_mesos_image option causes the newly started Mesos job to run in
> its own PID namespace instead of inheriting the "pid: host" setting of the parent container.
> So regardless of whether the parent container was started with "pid: host", it will never
> be able to find the PID.
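
One way to confirm the suspected PID namespace mismatch is to take the PID from the fatal log line and check whether it is visible from inside the container that runs the health checker. The following is a minimal diagnostic sketch, not part of Mesos. The helper names (extract_task_pid, pid_visible, pid_namespace_id) are hypothetical, and it assumes a Linux /proc mount that reflects the PID namespace of the container it is run in.

```python
import os
import re

# Matches the "(pid: '13527')" fragment of the health_checker.cpp log line
# quoted in the report. The pattern is an assumption based on that one line.
PID_PATTERN = re.compile(r"\(pid: '(\d+)'\)")


def extract_task_pid(log_line):
    """Return the task PID named in a health checker log line, or None."""
    match = PID_PATTERN.search(log_line)
    return int(match.group(1)) if match else None


def pid_visible(pid, proc_root="/proc"):
    """Return True if `pid` has an entry under proc_root.

    A PID that exists on the host but has no entry here is not visible
    in the PID namespace that this /proc mount reflects.
    """
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def pid_namespace_id(pid="self", proc_root="/proc"):
    """Return the PID namespace link target (e.g. 'pid:[...]'), or None.

    None is returned when the link cannot be read, for example because
    the process is not visible or the platform has no such entry.
    """
    try:
        return os.readlink(os.path.join(proc_root, str(pid), "ns", "pid"))
    except OSError:
        return None
```

Running pid_visible(13527) on the host and again inside the executor container would show whether the task PID is only visible on the host. Comparing pid_namespace_id() from both places would show whether the two actually share a PID namespace.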



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
