brooklyn-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BROOKLYN-264) Stop app while VM still being provisioned: vm is left running when app is expunged
Date Tue, 19 Jul 2016 10:35:21 GMT


ASF GitHub Bot commented on BROOKLYN-264:

Github user aledsage commented on a diff in the pull request:
    --- Diff: software/base/src/main/java/org/apache/brooklyn/entity/software/base/lifecycle/
    @@ -379,20 +418,30 @@ public MachineLocation call() throws Exception {
                 if (!(location instanceof LocalhostMachineProvisioningLocation))
           "Starting {}, obtaining a new location instance in {} with ports {}", new Object[]{entity(), location, flags.get("inboundPorts")});
                 entity().sensors().set(SoftwareProcess.PROVISIONING_LOCATION, location);
    +            Transition expectedState = entity().sensors().get(Attributes.SERVICE_STATE_EXPECTED);
    +            // BROOKLYN-263: see corresponding code in doStop()
    +            if (expectedState != null && (expectedState.getState() == Lifecycle.STOPPING || expectedState.getState() == Lifecycle.STOPPED)) {
    +                throw new IllegalStateException("Provisioning aborted before even begun for "+entity()+" in "+location+" (presumably by a concurrent call to stop");
    +            }
    +            entity().sensors().set(PROVISIONING_TASK_STATE, ProvisioningTaskState.RUNNING);
                 MachineLocation machine;
                 try {
                     machine = Tasks.withBlockingDetails("Provisioning machine in " + location, new ObtainLocationTask(location, flags));
    -                if (machine == null)
    -                    throw new NoMachinesAvailableException("Failed to obtain machine in " + location.toString());
    -            } catch (Exception e) {
    -                throw Exceptions.propagate(e);
    +                entity().sensors().set(PROVISIONED_MACHINE, machine);
    --- End diff --
    I'm willing to take that risk, since the description says "internal" etc. The problem is that otherwise there is a big gap between when provisioning returns and when we add the machine to the location. I worry that start() will not always go through all those steps if there has been a concurrent call to stop; it might skip executing those subsequent tasks. With the change I made, we set DONE and the machine in a finally block, which removes that risk.
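
    The finally-block idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `ProvisioningSketch`, `TaskState`, the two `AtomicReference` fields (stand-ins for the `PROVISIONING_TASK_STATE` and `PROVISIONED_MACHINE` sensors) and the `Supplier<String>` standing in for the real provisioning call are all hypothetical names chosen for the example.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for the start-side provisioning step.
// None of these names are Brooklyn APIs.
class ProvisioningSketch {
    enum TaskState { RUNNING, DONE }

    // Stand-ins for the PROVISIONING_TASK_STATE and PROVISIONED_MACHINE sensors.
    final AtomicReference<TaskState> taskState = new AtomicReference<>();
    final AtomicReference<String> provisionedMachine = new AtomicReference<>();

    // 'obtain' is an assumed placeholder for the blocking call to the cloud provider.
    String provision(Supplier<String> obtain) {
        taskState.set(TaskState.RUNNING);
        String machine = null;
        try {
            machine = obtain.get();
            return machine;
        } finally {
            // Runs whether obtain succeeded or threw, so a concurrent stop
            // never waits forever on RUNNING. The machine is recorded before
            // DONE, so an observer that sees DONE also sees the machine (if any).
            provisionedMachine.set(machine);
            taskState.set(TaskState.DONE);
        }
    }

    public static void main(String[] args) {
        ProvisioningSketch sketch = new ProvisioningSketch();
        try {
            sketch.provision(() -> { throw new IllegalStateException("simulated provisioning failure"); });
        } catch (IllegalStateException e) {
            System.out.println("state after failure: " + sketch.taskState.get());
        }
        sketch.provision(() -> "vm-1");
        System.out.println("recorded machine: " + sketch.provisionedMachine.get());
    }
}
```

    Recording both values in the same finally block means they do not depend on any later start() step being executed, which is the gap the comment is worried about.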

> Stop app while VM still being provisioned: vm is left running when app is expunged
> ----------------------------------------------------------------------------------
>                 Key: BROOKLYN-264
>                 URL:
>             Project: Brooklyn
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Aled Sage
> A customer deployed an app to AWS, but while the VM was still starting up they stopped (and thus expunged) the app. The app disappeared from the Brooklyn web-console, but the starting VM was left behind in AWS.
> This is simple to reproduce:
> 1. deploy a simple blueprint, such as:
> {noformat}
> location: aws-ec2:us-east-1
> services:
> - type: org.apache.brooklyn.entity.machine.MachineEntity
> {noformat}
> 2. wait for the VM to appear in the AWS web-console (with state "initialising")
> 3. call the {{stop}} effector on the top-level app.
> ---
> Looking at the {{start}} task that was executing at the time when {{stop}} was called, below is the thread's stack trace:
> {noformat}
> Provisioning machine in JcloudsLocation[AWS Virginia:AAAAAAAAAAAAAAAAAAAA/aws-ec2:us-east-1@eyNrLIo5]
> Task[provisioning (AWS Virginia)]@MJITkjw0
> Submitted by SoftlyPresent[value=Task[start]@tKw0qJET]
> In progress, thread waiting (notify) on java.util.concurrent.CountDownLatch$Sync@2ed5be36
> At: org.jclouds.concurrent.FutureIterables.awaitCompletion(
>     org.jclouds.compute.internal.BaseComputeService.createNodesInGroup(
>     org.jclouds.ec2.compute.EC2ComputeService.createNodesInGroup(
>     org.apache.brooklyn.location.jclouds.JcloudsLocation.obtainOnce(
>     org.apache.brooklyn.location.jclouds.JcloudsLocation.obtain(
>     org.apache.brooklyn.util.core.task.Tasks.withBlockingDetails(
>     org.apache.brooklyn.util.core.task.DynamicSequentialTask$
>     org.apache.brooklyn.util.core.task.BasicExecutionManager$
> {noformat}
> From this, we can see that we are still calling jclouds. This means that jclouds has not yet returned to Brooklyn the VM's id. It also means that the {{MachineEntity}} will not have been given a {{JcloudsSshMachineLocation}} instance.
> When {{stop}} is called on the {{MachineEntity}}, it doesn't have a machine location instance so it doesn't have anything to ask to stop. This is why the VM is left running.

This message was sent by Atlassian JIRA
