mesos-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Mahler (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-7921) ProcessManager::resume sometimes crashes accessing EventQueue.
Date Sun, 22 Oct 2017 19:24:00 GMT

     [ https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Mahler updated MESOS-7921:
-----------------------------------
    Fix Version/s: 1.4.1

> ProcessManager::resume sometimes crashes accessing EventQueue.
> --------------------------------------------------------------
>
>                 Key: MESOS-7921
>                 URL: https://issues.apache.org/jira/browse/MESOS-7921
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 1.4.0
>         Environment: autotools,gcc,--verbose,GLOG_v=1 MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>            Reporter: Yan Xu
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>             Fix For: 1.4.1, 1.5.0
>
>         Attachments: FetcherCacheTest.CachedCustomOutputFileWithSubdirectory.log.txt,
MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt
>
>
> The following segfault is found on [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky and shows up
in other tests and environments (with or without --enable-lock-free-event-queue) as well.
> {noformat: title=Configuration}
> ./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
> {noformat}
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are using GNU
date ***
> PC: @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack trace:
***
>     @     0x2b9e29d26330 (unknown)
>     @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
>     @     0x2b9e25800a40 process::ProcessManager::resume()
>     @     0x2b9e2580f891 process::ProcessManager::init_threads()::$_9::operator()()
>     @     0x2b9e2580f7d5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
>     @     0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
>     @     0x2b9e2580f77c std::thread::_Impl<>::_M_run()
>     @     0x2b9e29fe5a60 (unknown)
>     @     0x2b9e29d1e184 start_thread
>     @     0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}
> A builds@mesos.apache.org query shows many such instances: https://lists.apache.org/list.html?builds@mesos.apache.org:lte=1M:process%3A%3AEventQueue%3A%3AConsumer%3A%3Aempty



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message