brooklyn-dev mailing list archives

From "Aled Sage (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BROOKLYN-484) JBoss7 entity restart fails (launch ssh session returns before process running?)
Date Thu, 20 Apr 2017 16:48:04 GMT
Aled Sage created BROOKLYN-484:
----------------------------------

             Summary: JBoss7 entity restart fails (launch ssh session returns before process running?)
                 Key: BROOKLYN-484
                 URL: https://issues.apache.org/jira/browse/BROOKLYN-484
             Project: Brooklyn
          Issue Type: Bug
            Reporter: Aled Sage
            Priority: Minor


With version 0.11.0-rc1...

We've seen a failure of the {{restart}} effector for {{JBoss7Server}}. The post-launch step
failed while waiting for the URL to become reachable and responsive.

Unfortunately there's no additional debugging information available - the VMs are gone, and
the debug log is not available.

However, I've identified a reason why this might happen.

On {{start}}, the {{JBoss7SshDriver.launch}} script will redirect stdout/stderr to a file
named {{console}}, and will then wait for that file to say 'starting'.

Importantly, there is an old comment saying:
{noformat}
        // We wait for evidence of JBoss running because, using SshCliTool,
        // we saw the ssh session return before the JBoss process was fully running
        // so the process failed to start.
{noformat}

On {{restart}}, it stops the process and then calls {{JBoss7SshDriver.launch}} again. However,
the launch appends to the existing {{console}} file, which still contains 'starting' from the
previous run. The check for 'starting' therefore succeeds immediately, which means the ssh
session could return before the new JBoss process is fully running.
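
To illustrate the suspected behaviour, here is a minimal sketch in Java. It is not the Brooklyn launch script: the class {{ConsoleLogCheck}} and its method are hypothetical, and the console file is modelled as an in-memory list of lines. It only assumes, as described above, that the check passes as soon as any line in the file contains 'starting'.

```java
import java.util.List;

/** Illustrative model only; not Brooklyn code. */
final class ConsoleLogCheck {
    private ConsoleLogCheck() {}

    /**
     * Mimics the launch-time check: passes as soon as any line of the
     * console file contains the marker text 'starting'.
     */
    static boolean saysStarting(List<String> consoleLines) {
        for (String line : consoleLines) {
            if (line.contains("starting")) {
                return true;
            }
        }
        return false;
    }
}
```

On a first start the list is empty, so the check has to wait for the new process to write its output. On a restart in append mode, the lines from the previous run are still present, so the same check passes before the new process has written anything.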

A solution would be to change {{launch}} so that it first moves the previous {{console}} file
out of the way. Subsequent calls to the {{launch}} script would then wait for the new process
to be running.
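
A rough sketch of the "move the previous file" idea follows. In the driver this would more likely be a shell command in the launch script on the remote machine; the Java below only illustrates the logic against a local path. The class name {{ConsoleLogRotation}} and the {{.previous}} suffix are assumptions made for this example, not existing Brooklyn code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative sketch only; not Brooklyn code. */
final class ConsoleLogRotation {
    private ConsoleLogRotation() {}

    /**
     * Moves an existing console file aside to "<name>.previous" (a hypothetical
     * naming choice), replacing any older copy, so that the next launch writes
     * to a fresh file. Returns true if a file was moved, false if none existed.
     */
    static boolean moveAside(Path console) {
        if (!Files.exists(console)) {
            return false; // first start: nothing to rotate
        }
        Path previous = console.resolveSibling(console.getFileName() + ".previous");
        try {
            Files.move(console, previous, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

Moving rather than deleting keeps the previous run's output available for debugging, at the cost of retaining only one older copy.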

This same problem would also apply to other entities, such as {{TomcatSshDriver.launch}}.

A way to reproduce this would probably be to repeatedly call the {{restart}} effector (waiting
for serviceUp to be true again between each). It almost always works - I've personally only
seen this failure once.
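
The reproduction loop could be sketched roughly as below. This is a generic harness, not Brooklyn API: {{RestartLoop}} is a hypothetical name, and the {{restart}} and {{serviceUp}} arguments are placeholders for whatever invokes the effector and reads the sensor. It also assumes a failed restart shows up as serviceUp not returning to true within the timeout, rather than as an exception from the effector call.

```java
import java.util.function.BooleanSupplier;

/** Illustrative harness only; not Brooklyn code. */
final class RestartLoop {
    private RestartLoop() {}

    /**
     * Calls restart up to 'attempts' times, polling serviceUp after each call.
     * Returns the 1-based number of the first restart after which serviceUp did
     * not become true within timeoutMillis, or -1 if every restart succeeded.
     */
    static int firstFailingRestart(Runnable restart, BooleanSupplier serviceUp,
                                   int attempts, long timeoutMillis, long pollMillis) {
        for (int i = 1; i <= attempts; i++) {
            restart.run();
            long deadline = System.currentTimeMillis() + timeoutMillis;
            boolean up = serviceUp.getAsBoolean();
            while (!up && System.currentTimeMillis() < deadline) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return i; // treat interruption as a failure of this attempt
                }
                up = serviceUp.getAsBoolean();
            }
            if (!up) {
                return i;
            }
        }
        return -1;
    }
}
```

Given how rarely the failure has been seen, a large number of attempts would probably be needed before the loop reports anything.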



