mesos-issues mailing list archives

From "James DeFelice (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-2865) intermittently the executor is not receiving TASK_KILLED
Date Sat, 22 Aug 2015 07:50:45 GMT

    [ https://issues.apache.org/jira/browse/MESOS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707935#comment-14707935 ]

James DeFelice commented on MESOS-2865:
---------------------------------------

Interestingly enough, I was still able to reproduce the problem after a while, even with the
work-around code!
Some additional debugging revealed that the HTTP connection was no longer fully open: Go had
buffered all of the data, but when the http lib tried to write a 202 response, the write()
failed with a pipe error (already closed). Normally the http lib would abort and stop reading
data from the request. When I added some code to ignore the failure on write-response-header,
I was able to continue fully processing the buffered request backlog.

I'm having a hard time believing that Mesos doesn't wait for an HTTP 202 for each pipelined
request it generates - can this be true?

> intermittently the executor is not receiving TASK_KILLED
> --------------------------------------------------------
>
>                 Key: MESOS-2865
>                 URL: https://issues.apache.org/jira/browse/MESOS-2865
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.21.1, 0.23.0
>         Environment: {code}
> $ dpkg -l |grep -e mesos
> ii  mesos                               0.21.1-1.1.ubuntu1404            amd64       Cluster resource manager with efficient resource isolation
> $ uname -a
> Linux node-1 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> For details and log snippets, see https://github.com/mesosphere/kubernetes-mesos/issues/328
> The slave logs that it's been asked to kill a pod, but the message is never logged as received by the executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
