mesos-user mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Re: What will happen in maintenance mode
Date Mon, 18 Jul 2016 18:17:51 GMT
My guess is that your agents don't match the machines you specified.  Note:
The maintenance endpoints in Mesos allow you to specify maintenance against
non-existent machines, because the operator may add agents on those
machines in the future.

In Mesos' maintenance primitives, a "machine" is a hostname + IP.  (A
physical/virtual machine can hold multiple agents.)  The response in
/maintenance/status is in terms of machines, not agents.  If none of your
frameworks support inverse offers, then you won't get any useful
information from the /maintenance/status endpoint.

You can figure out an agent's hostname/IP by hitting the /master/slaves
endpoint:

{
  "slaves": [
    {
      "pid":"slave(1)@127.0.0.1:5051",
      "hostname":"foo-bar",
      ...

^ The above translates to a machine = { "hostname": "foo-bar", "ip": "127.0.0.1" }
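
To check whether the machines in a schedule actually line up with registered agents, the translation above can be scripted. The following is only a minimal sketch: the helper names `machine_id_from_agent` and `unmatched_machines` are hypothetical, the sample data is the truncated /master/slaves snippet above, the second scheduled machine is made up for illustration, and it assumes IPv4 pids and an exact match on both hostname and IP.

```python
import json


def machine_id_from_agent(agent):
    """Derive a machine ID (hostname + IP) from one /master/slaves entry."""
    # A pid looks like "slave(1)@127.0.0.1:5051": the IP sits between
    # the "@" and the last ":" (assumes an IPv4 address).
    address = agent["pid"].split("@", 1)[1]
    ip = address.rsplit(":", 1)[0]
    return {"hostname": agent["hostname"], "ip": ip}


def unmatched_machines(scheduled, agents):
    """Return the scheduled machines that no registered agent maps to."""
    known = {
        (m["hostname"], m["ip"])
        for m in (machine_id_from_agent(a) for a in agents)
    }
    return [
        m for m in scheduled
        if (m.get("hostname"), m.get("ip")) not in known
    ]


# Sample data only: one agent as in the snippet above, plus a
# hypothetical schedule with one matching and one non-matching machine.
sample = json.loads(
    '{"slaves": [{"pid": "slave(1)@127.0.0.1:5051", "hostname": "foo-bar"}]}'
)
scheduled = [
    {"hostname": "foo-bar", "ip": "127.0.0.1"},
    {"hostname": "foo-bar", "ip": "10.0.0.5"},
]

print(machine_id_from_agent(sample["slaves"][0]))
print(unmatched_machines(scheduled, sample["slaves"]))
```

Any machine the second call returns is one that the maintenance endpoints accepted but that no current agent corresponds to, which would explain why nothing on the real agents is affected.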

On Mon, Jul 18, 2016 at 2:08 AM, Qiang Chen <qzschen@gmail.com> wrote:

> Hi all,
>
> I'm puzzled about how maintenance mode works.
>
> I see this on the Mesos [doc site](
> http://mesos.apache.org/documentation/latest/maintenance/):
>
> ```
> When maintenance is triggered by the operator, all agents on the machine
> are told to shutdown. These agents are removed from the master, which means
> that a TASK_LOST status update will be sent for every task running on
> each of those agents. The scheduler driver’s slaveLost callback will also
> be invoked for each of the removed agents. Any agents on machines in
> maintenance are also prevented from re-registering with the master in the
> future (until maintenance is completed and the machine is brought back up).
> ```
> But I didn't see the agent machines shut down or any tasks fail when I
> tested the maintenance HTTP endpoints.
>
> If Mesos agents are in that mode, will the running tasks be moved to other
> agents? Namely, will it evacuate all the tasks on those agents and then
> shut them down?
>
> When I POST to "/maintenance/schedule" and "/machine/down" with a proper
> maintenance time window, GET "/maintenance/status" shows the specified
> agents in the "draining_machines" and "down_machines" lists, but nothing
> was shut down and no tasks were evacuated. Why? Does that make sense?
>
> Thanks.
>
> --
> Best Regards,
> Chen, Qiang
>
>
