mesos-issues mailing list archives

From "Greg Mann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-3264) JVM can exit prematurely following framework teardown
Date Tue, 18 Aug 2015 22:49:46 GMT

    [ https://issues.apache.org/jira/browse/MESOS-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702133#comment-14702133
] 

Greg Mann commented on MESOS-3264:
----------------------------------

[~haosdent@gmail.com], I ran that snippet in the JVM and it produced exactly that output.
Perhaps the problem is not that the order of shutdown hooks is unspecified, but that the
hooks run concurrently: the JVM may move on to other native object destructors once it has
started {{SchedulerDriver.finalize()}}, while that method is still running. I'm not sure
where the destruction of native file-scope objects fits into this shutdown sequence, and
one of the mutexes causing trouble exists at file scope in {{glog-0.3.3/src/logging.cc}}.
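
For reference, a minimal sketch of the "hooks run concurrently" part of this guess. It only relies on the documented behavior of {{Runtime.addShutdownHook}} (hooks are started in an unspecified order and run concurrently); the class and method names are made up for illustration, and it does not touch native destructors or glog at all. The two hooks use latches so that they can only both finish if they overlap in time.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of Mesos.
class ConcurrentHooksSketch {
    // Builds two hook threads that can only both finish if they overlap in time.
    static Thread[] buildHooks(List<String> events) {
        CountDownLatch slowStarted = new CountDownLatch(1);
        CountDownLatch fastFinished = new CountDownLatch(1);

        Thread slow = new Thread(() -> {
            record(events, "slow hook started");
            slowStarted.countDown();
            awaitQuietly(fastFinished); // still running while the other hook completes
            record(events, "slow hook finished");
        });
        Thread fast = new Thread(() -> {
            awaitQuietly(slowStarted); // starts its work only once the slow hook is underway
            record(events, "fast hook finished");
            fastFinished.countDown();
        });
        return new Thread[] {slow, fast};
    }

    static void record(List<String> events, String message) {
        events.add(message);
        System.out.println(message);
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the same threads directly (no JVM shutdown), to inspect the interleaving.
    static List<String> runDirectly() {
        List<String> events = new CopyOnWriteArrayList<>();
        Thread[] hooks = buildHooks(events);
        for (Thread hook : hooks) {
            hook.start();
        }
        for (Thread hook : hooks) {
            try {
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return events;
    }

    public static void main(String[] args) {
        // Register both threads as real shutdown hooks; they run when main returns.
        for (Thread hook : buildHooks(new CopyOnWriteArrayList<>())) {
            Runtime.getRuntime().addShutdownHook(hook);
        }
    }
}
```

If the hooks were run one after another, this program would hang at exit. Because they run concurrently, the fast hook finishes while the slow one is still in progress, which is the kind of overlap suspected above.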

> JVM can exit prematurely following framework teardown
> -----------------------------------------------------
>
>                 Key: MESOS-3264
>                 URL: https://issues.apache.org/jira/browse/MESOS-3264
>             Project: Mesos
>          Issue Type: Bug
>          Components: java api
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Greg Mann
>            Priority: Minor
>              Labels: java, tech-debt
>
> In Java frameworks, it is possible for the JVM to begin exiting the program - via {{System.exit()}},
for example - while teardown of native objects such as the SchedulerDriver and associated
Executors is still in progress. {{SchedulerDriver::stop()}} returns once it has sent
messages to other actors to begin their teardown; meanwhile, the JVM is free to terminate the
program and thus begin executing native object destructors while those objects are still in
use, potentially leading to a segfault.
> This has manifested itself in flaky tests from the ExamplesTest suite (see MESOS-830
and MESOS-1013), as mutexes from glog are destroyed while the framework is still shutting
down and attempting to log.
> Ideally, a mechanism would exist to block the Java code until it receives confirmation
that framework teardown is complete.
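
One way such a blocking mechanism could look on the Java side is sketched below, using a {{CountDownLatch}} as the "teardown complete" signal. This is only an illustration under stated assumptions: {{TeardownGateSketch}}, {{confirmTeardown()}} and {{awaitTeardown()}} are hypothetical names, not Mesos APIs, and the background thread merely stands in for asynchronous native teardown.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of the Mesos Java API.
class TeardownGateSketch {
    private final CountDownLatch teardownComplete = new CountDownLatch(1);

    // Called by whichever component learns that teardown has finished.
    void confirmTeardown() {
        teardownComplete.countDown();
    }

    // Blocks until teardown is confirmed or the timeout elapses.
    // Returns true only if the confirmation arrived in time.
    boolean awaitTeardown(long timeout, TimeUnit unit) {
        try {
            return teardownComplete.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        TeardownGateSketch gate = new TeardownGateSketch();

        // Stand-in for asynchronous teardown that completes some time after stop() returns.
        Thread teardown = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            gate.confirmTeardown();
        });
        teardown.start();

        // Do not call System.exit() until teardown has been confirmed (or we give up).
        boolean confirmed = gate.awaitTeardown(5, TimeUnit.SECONDS);
        System.out.println("teardown confirmed: " + confirmed);
        System.exit(confirmed ? 0 : 1);
    }
}
```

The timeout is a design choice in this sketch: it keeps the JVM from hanging forever if the confirmation never arrives, at the cost of reintroducing the race in that case.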



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
