Date: Sun, 22 Oct 2017 19:24:00 +0000 (UTC)
From: "Benjamin Mahler (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Updated] (MESOS-7921) ProcessManager::resume sometimes crashes accessing EventQueue.

     [ https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-7921:
-----------------------------------
    Fix Version/s: 1.4.1

> ProcessManager::resume sometimes crashes accessing EventQueue.
> --------------------------------------------------------------
>
>                 Key: MESOS-7921
>                 URL: https://issues.apache.org/jira/browse/MESOS-7921
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 1.4.0
>         Environment: autotools,gcc,--verbose,GLOG_v=1 MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>            Reporter: Yan Xu
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>             Fix For: 1.4.1, 1.5.0
>
>         Attachments: FetcherCacheTest.CachedCustomOutputFileWithSubdirectory.log.txt, MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt
>
>
> The following segfault is found on [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/] in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky and shows up in other tests and environments (with or without --enable-lock-free-event-queue) as well.
> {noformat: title=Configuration}
> ./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
> {noformat}
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are using GNU date ***
> PC: @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack trace: ***
>     @     0x2b9e29d26330 (unknown)
>     @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
>     @     0x2b9e25800a40 process::ProcessManager::resume()
>     @     0x2b9e2580f891 process::ProcessManager::init_threads()::$_9::operator()()
>     @     0x2b9e2580f7d5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
>     @     0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
>     @     0x2b9e2580f77c std::thread::_Impl<>::_M_run()
>     @     0x2b9e29fe5a60 (unknown)
>     @     0x2b9e29d1e184 start_thread
>     @     0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}
> A builds@mesos.apache.org query shows many such instances:
https://lists.apache.org/list.html?builds@mesos.apache.org:lte=1M:process%3A%3AEventQueue%3A%3AConsumer%3A%3Aempty



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
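
Since the same crash shows up across many build logs, it can help to pull the key fields out of the quoted crash report mechanically. The following is a minimal sketch, not part of Mesos or glog: `parse_report`, the regular expressions, and the abbreviated `REPORT` excerpt are all illustrative assumptions based only on the report layout quoted in this message.

```python
import re
from datetime import datetime, timezone

# Abbreviated excerpt of the crash report quoted in the issue
# (hypothetical test input; several middle frames are omitted).
REPORT = """\
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack trace: ***
    @ 0x2b9e29d26330 (unknown)
    @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
    @ 0x2b9e25800a40 process::ProcessManager::resume()
    @ 0x2b9e29d1e184 start_thread
"""

# Patterns are assumptions derived from the layout shown above.
ABORT_RE = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")
SIGNAL_RE = re.compile(r"\*\*\* (SIG\w+) \(@(0x[0-9a-f]+)\) received by PID (\d+)")
FRAME_RE = re.compile(
    r"^[ \t]*@[ \t]+(0x[0-9a-f]+)[ \t]+(.*\S)[ \t]*$", re.MULTILINE
)


def parse_report(text):
    """Return abort time (UTC), signal name, fault address, PID, and frames."""
    abort = ABORT_RE.search(text)
    signal = SIGNAL_RE.search(text)
    if abort is None or signal is None:
        raise ValueError("text does not look like the expected crash report")
    # Each frame is (program counter, symbol); the "PC:" line is skipped
    # because it does not start with "@".
    frames = [(int(addr, 16), symbol) for addr, symbol in FRAME_RE.findall(text)]
    return {
        "aborted_at": datetime.fromtimestamp(int(abort.group(1)), tz=timezone.utc),
        "signal": signal.group(1),
        "fault_address": int(signal.group(2), 16),
        "pid": int(signal.group(3)),
        "frames": frames,
    }


if __name__ == "__main__":
    report = parse_report(REPORT)
    print(report["aborted_at"].isoformat(), report["signal"],
          hex(report["fault_address"]))
    for pc, symbol in report["frames"]:
        print(f"{pc:#x} {symbol}")
```

For the report above, the abort time decodes to 2017-08-28 16:31:25 UTC and the faulting address is 0x8. An address that small is often the result of reading a field through a null pointer, but the trace alone does not confirm that this is what happened here.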