aurora-issues mailing list archives

From "Bill Farner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1388) If mesos_slave gets a SIGUSR1, thermos doesn't shutdown cleanly
Date Thu, 09 Jul 2015 17:12:06 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620863#comment-14620863 ]

Bill Farner commented on AURORA-1388:
-------------------------------------

Relevant: if you are doing things like fleet-wide maintenance, consider using the maintenance
commands in {{aurora_admin}}.  They should drain hosts safely in a way that minimizes churn.
We should fix this bug regardless, however.

> If mesos_slave gets a SIGUSR1, thermos doesn't shutdown cleanly
> ---------------------------------------------------------------
>
>                 Key: AURORA-1388
>                 URL: https://issues.apache.org/jira/browse/AURORA-1388
>             Project: Aurora
>          Issue Type: Bug
>            Reporter: Brian Brazil
>
> https://issues.apache.org/jira/browse/MESOS-1475 allows a SIGUSR1 to be sent to a mesos
> slave in order to shut down the slave and its processes cleanly, which is useful for changing
> slave attributes.
> I tried this with my aurora setup, and via tcpdump found that it sent the first {{/shutdown}}
> http request to the task, but nothing after it. The process also kept running, holding onto
> a static port, which in my case prevented things from working when a task was scheduled on
> that slave after it came back up.
> We should ensure that thermos behaves correctly when the mesos slave gets a SIGUSR1,
> following the lifecycle policy and ultimately killing the processes if needed.
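
The behaviour asked for above (try a graceful request first, then escalate until the process is gone) could be sketched roughly as follows. This is not thermos code. The function name, the step names, and the idea of a single fixed grace period per step are all assumptions made for illustration; the actual lifecycle policy may use different steps and timings.

```python
import time
from typing import Callable, List, Optional, Tuple

# A step is a (name, action) pair. The names used below are hypothetical.
Step = Tuple[str, Callable[[], None]]


def shutdown_with_escalation(
    is_alive: Callable[[], bool],
    steps: List[Step],
    grace_seconds: float,
    poll_interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Run each step in order until the process is gone.

    Returns the name of the step after which the process exited, or None
    if it was already gone before any step ran. Raises RuntimeError if the
    process survives every step.
    """
    if not is_alive():
        return None
    for name, action in steps:
        action()
        # Give the process up to grace_seconds to exit before escalating.
        deadline = clock() + grace_seconds
        while clock() < deadline:
            if not is_alive():
                return name
            sleep(poll_interval)
        # Final check once the grace period has elapsed.
        if not is_alive():
            return name
    raise RuntimeError("process survived every shutdown step")
```

In a real setup the steps might be an HTTP request to the task, then `os.kill(pid, signal.SIGTERM)`, then `os.kill(pid, signal.SIGKILL)`, so that the static port is released even when the task ignores the first request.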



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
