mesos-issues mailing list archives

From "Benjamin Mahler (JIRA)" <>
Subject [jira] [Commented] (MESOS-8729) Libprocess: deadlock in process::finalize
Date Tue, 27 Mar 2018 01:42:00 GMT


Benjamin Mahler commented on MESOS-8729:

Looking at the last stack:
#8 0x00007f09d2ac1aac in synchronize<std::recursive_mutex> () at ../../3rdparty/stout/include/stout/synchronized.hpp:58
#9 0x00007f09d492c37b in process::ProcessManager::use () at ../../../3rdparty/libprocess/src/process.cpp:2520
#10 0x00007f09d492e955 in process::ProcessManager::deliver () at ../../../3rdparty/libprocess/src/process.cpp:2775
// Trying to get a reference but blocked on the lock.
#66 0x00007f09d492e988 in process::ProcessManager::deliver () at ../../../3rdparty/libprocess/src/process.cpp:2776
// XXX Holds a reference!
This thread is in deliver while holding a reference, then synchronously calls back into
deliver, where it blocks on the lock with that reference still held. The first thread is
therefore stuck spinning while holding the lock, and the reference will never be released.
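
To make the cycle concrete, here is a minimal sketch of the two threads described above. It is not libprocess code: `Manager`, `Outcome` and `simulate` are hypothetical names, a `std::timed_mutex` stands in for the lock taken in `use()`, and an atomic counter stands in for the reference. Unlike the real code, the nested lock attempt has a 50 ms timeout and the spinning thread has an escape hatch, purely so that the sketch terminates instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; these are not libprocess types.
struct Manager {
  std::timed_mutex lock;     // plays the role of the lock taken in use()
  std::atomic<int> refs{0};  // plays the role of the process reference count
};

struct Outcome {
  bool cleaner_saw_zero_refs;  // did the spinning thread ever see refs == 0?
  bool nested_use_got_lock;    // did the nested use() ever get the lock?
};

Outcome simulate() {
  Manager m;
  std::atomic<bool> lock_held{false};
  std::atomic<bool> nested_done{false};
  Outcome out{false, false};

  // Outer deliver(): this thread already holds a reference.
  m.refs.fetch_add(1);

  // "First thread": takes the lock, then spins until refs drops to zero.
  std::thread cleaner([&] {
    std::lock_guard<std::timed_mutex> guard(m.lock);
    lock_held.store(true);
    while (m.refs.load() > 0) {
      // Assumption for the sketch only: give up once the nested attempt
      // has timed out. The scenario in the stack trace has no such exit.
      if (nested_done.load()) return;
      std::this_thread::yield();
    }
    out.cleaner_saw_zero_refs = true;
  });

  // Wait until the first thread really owns the lock.
  while (!lock_held.load()) std::this_thread::yield();

  // Nested deliver() -> use(): needs the lock while the reference is held.
  {
    std::unique_lock<std::timed_mutex> guard(
        m.lock, std::chrono::milliseconds(50));
    out.nested_use_got_lock = guard.owns_lock();
  }
  nested_done.store(true);
  cleaner.join();

  // Only after the nested call returns can the outer deliver() drop its
  // reference, which is too late for the spinning thread.
  m.refs.fetch_sub(1);
  assert(m.refs.load() == 0);
  return out;
}
```

In this sketch both fields of the returned `Outcome` are false: the nested call cannot get the lock while the first thread holds it, and the first thread never sees the count reach zero because the reference is only dropped after the nested call returns. Without the timeout and the escape hatch, neither thread would make progress.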
I understand the issue now but haven't thought through a fix.

> Libprocess: deadlock in process::finalize
> -----------------------------------------
>                 Key: MESOS-8729
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 1.6.0
>         Environment: The issue has been reproduced on Ubuntu 16.04, master branch, commit
>            Reporter: Andrei Budnik
>            Priority: Major
>              Labels: deadlock, libprocess
>         Attachments: deadlock.txt
> Since we are calling `libprocess::finalize()` before returning from the IOSwitchboard's
> main function, we expect that all HTTP responses are going to be sent back to clients
> before IOSwitchboard terminates. However, after adding `libprocess::finalize()` we have
> seen that IOSwitchboard might get stuck in `libprocess::finalize()`.
> See attached stacktrace.

This message was sent by Atlassian JIRA
