karaf-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KARAF-5315) Race condition during shutdown using SIGTERM
Date Mon, 18 Sep 2017 09:00:17 GMT

    [ https://issues.apache.org/jira/browse/KARAF-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16169772#comment-16169772 ]

ASF GitHub Bot commented on KARAF-5315:

GitHub user jbonofre opened a pull request:


    [KARAF-5315] Synchronize lock methods


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jbonofre/karaf KARAF-5315

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #374

commit 9e9a97c5a46b84d239bfbe6841ed6efe37ea8993
Author: Jean-Baptiste Onofré <jbonofre@apache.org>
Date:   2017-09-18T08:51:37Z

    [KARAF-5315] Synchronize lock methods


> Race condition during shutdown using SIGTERM
> --------------------------------------------
>                 Key: KARAF-5315
>                 URL: https://issues.apache.org/jira/browse/KARAF-5315
>             Project: Karaf
>          Issue Type: Bug
>    Affects Versions: 4.0.9
>         Environment: Linux using systemd
>            Reporter: Martin Krüger
>            Assignee: Jean-Baptiste Onofré
>             Fix For: 4.2.0, 4.0.10, 4.1.3
>         Attachments: 0001-KARAF-5315-Synchronize-access-to-SimpleFileLock.patch, 0002-KARAF-5315-Signal-handler-stops-framework-directly.patch
> During shutdown using SIGTERM there is a race condition.
> {noformat}
> Error occurred shutting down framework: java.nio.channels.ClosedChannelException
> java.nio.channels.ClosedChannelException
>          at sun.nio.ch.FileLockImpl.release(FileLockImpl.java:58)
>          at org.apache.karaf.main.lock.SimpleFileLock.release(SimpleFileLock.java:78)
>          at org.apache.karaf.main.Main.destroy(Main.java:642)
>          at org.apache.karaf.main.Main.main(Main.java:188)
> Main process exited, code=exited, status=254
> {noformat}
> There are several problems in the code of the Main class.
> # The variable indicating the exit condition (private boolean exiting; line 89), which is used in several threads, is not volatile.
> # The same is true for the lock (private Lock lock; line 87).
> # The signal handler calls Main.this.destroy(), which the main thread calls again after leaving the awaitShutdown() function (line 581).
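
The visibility problem in the first two points can be illustrated with a minimal sketch. ExitFlag and its methods are hypothetical names that only mirror the shape of the "exiting" field; this is not code from Karaf's Main class. Declaring the flag volatile is what guarantees that the waiting thread eventually sees the write made by another thread.

```java
// Hypothetical class for illustration; it is not part of Karaf.
class ExitFlag {
    // volatile makes a write by one thread visible to reads in other threads.
    private volatile boolean exiting;

    // Called by the thread that wants the waiter to stop.
    void requestExit() {
        exiting = true;
    }

    boolean isExiting() {
        return exiting;
    }

    // Loops until another thread calls requestExit().
    void awaitExit() {
        while (!exiting) {
            Thread.yield();
        }
    }
}
```

Without volatile, the Java memory model does not guarantee that the loop in awaitExit() ever observes the update.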
> Because the destroy() function releases the lock in its finally block, the lock is released twice. The implementation in use is SimpleFileLock, whose release() function is not synchronized. Since the channel of the file lock is already closed, the second call results in the exception.
> To get rid of the double release, the SimpleFileLock.release() function should be synchronized. But I am not sure whether the double call of the Main.destroy() function is an even bigger problem, because all activators are stopped twice too.
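
A minimal sketch of a release() that tolerates being called twice, assuming a lock built on java.nio's FileLock. GuardedFileLock and its members are hypothetical names for illustration, not Karaf's SimpleFileLock and not the content of the attached patches. In this sketch the methods are synchronized, and the null check inside release() is what turns the second call into a no-op.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Hypothetical class for illustration; this is not Karaf's SimpleFileLock.
class GuardedFileLock {
    private final File file;
    private RandomAccessFile raf;  // guarded by "this"
    private FileLock lock;         // guarded by "this"

    GuardedFileLock(File file) {
        this.file = file;
    }

    // Tries to take the OS-level lock; returns true if this object now holds it.
    synchronized boolean lock() throws IOException {
        if (lock != null) {
            return true;
        }
        RandomAccessFile f = new RandomAccessFile(file, "rw");
        FileLock l = null;
        try {
            l = f.getChannel().tryLock();
        } finally {
            if (l == null) {
                f.close();  // lock not obtained (or tryLock threw): do not leak the file
            }
        }
        if (l == null) {
            return false;
        }
        raf = f;
        lock = l;
        return true;
    }

    // Safe to call from several threads and more than once.
    synchronized void release() throws IOException {
        if (lock == null) {
            return;  // already released by an earlier call
        }
        try {
            lock.release();
        } finally {
            lock = null;
            try {
                raf.close();
            } finally {
                raf = null;
            }
        }
    }

    synchronized boolean isHeld() {
        return lock != null && lock.isValid();
    }
}
```

With this shape, a signal handler thread and the main thread can both call release() during shutdown; whichever arrives second finds the field already cleared and returns without touching the closed channel.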
> {code}
>             while (timeout > 0) {
>                 timeout -= step;
>                 FrameworkEvent event = framework.waitForStop(step);
>                 if (event.getType() != FrameworkEvent.WAIT_TIMEDOUT) {
>                     activatorManager.stopKarafActivators();
>                     return true;
>                 }
>             }
> {code}
> Maybe synchronizing the Main.destroy() function is a good idea too, or finding a different way for the signal handler to stop the framework: just signaling the stop and waking up the main thread ...
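
The last suggestion (the handler only signals, the main thread does the cleanup) could look roughly like the following sketch. ShutdownCoordinator and its methods are hypothetical names, not Karaf APIs, and the sketch deliberately leaves out how the signal handler is registered. It uses a CountDownLatch to wake the main thread and an AtomicBoolean so the cleanup runs at most once even if destroy() is called from two threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper for illustration; none of these names exist in Karaf.
class ShutdownCoordinator {
    private final CountDownLatch stopRequested = new CountDownLatch(1);
    private final AtomicBoolean destroyed = new AtomicBoolean(false);
    private final Runnable cleanup;

    ShutdownCoordinator(Runnable cleanup) {
        this.cleanup = cleanup;
    }

    // Called from the signal handler thread: it only signals and does no cleanup itself.
    void requestStop() {
        stopRequested.countDown();
    }

    // Called by the main thread; returns true once a stop has been requested.
    boolean awaitStop(long timeout, TimeUnit unit) throws InterruptedException {
        return stopRequested.await(timeout, unit);
    }

    // Runs the cleanup at most once; returns true only for the call that ran it.
    boolean destroy() {
        if (!destroyed.compareAndSet(false, true)) {
            return false;
        }
        cleanup.run();
        return true;
    }
}
```

In this arrangement the main thread would block in awaitStop(), then call destroy(); a stray second destroy() from another thread returns false instead of stopping activators or releasing the lock again.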

This message was sent by Atlassian JIRA
