stratos-dev mailing list archives

From Lakmal Warusawithana <lak...@wso2.com>
Subject Re: Stratos not properly terminating VMs to fail to startup
Date Fri, 03 Apr 2015 01:54:04 GMT
IMO we should move this to the next release since it is not a blocker. WDYT?

On Friday, April 3, 2015, Imesh Gunaratne <imesh@apache.org> wrote:

> By any chance, do we know why the above instances are going into the Error state?
>
> Thanks
>
> On Friday, April 3, 2015, Vanson Lim <vlim@cisco.com> wrote:
>
>>  On 4/2/15, 1:09 PM, Jeffrey Nguyen (jeffrngu) wrote:
>>
>>  Hi Anuruddha,
>>
>>  The instances that are in the Error state in OpenStack Horizon were never
>> removed, even after Stratos successfully spawned an instance. It sounds
>> like you might need to enhance the jclouds API to return an object with
>> nodeID info for this case. Or perhaps a better solution would be to modify
>> the jclouds API to delete the failed instance if it wasn’t spawned
>> successfully (or make that an option of the API call that spawns a new
>> instance).
>>
>>   Jeffrey,
>>
>> If jclouds is not returning a nodeID, it should handle the cleanup. It's
>> also reasonable for jclouds to return an object for the failed instance, but
>> that's not much use to Stratos except for leaving the VM around so that we
>> can see that it failed to come up. I don't know if we want an option
>> to have jclouds try to respawn an instance, as this would most likely fail.
>> It's better to have Stratos manage the retries.
>>
>> -Vanson.
>>
>>   Regards,
>> -Jeffrey
>>
>>   From: Anuruddha Liyanarachchi <anuruddhal@wso2.com>
>> Reply-To: "dev@stratos.apache.org" <dev@stratos.apache.org>
>> Date: Thursday, April 2, 2015 at 4:43 AM
>> To: "dev@stratos.apache.org" <dev@stratos.apache.org>
>> Subject: Re: Stratos not properly terminating VMs to fail to startup
>>
>>   Hi Vanson / Jeffrey,
>>
>>  As seen in the logs, the instance ID is not returned to Stratos
>> (instanceId=null) for the members which went into the error state. Therefore
>> Stratos doesn't have control over the instances in the error state, and
>> instances that were spawned with errors are not being deleted.
>>
>>
>>
>> On Wed, Apr 1, 2015 at 4:08 AM, Jeffrey Nguyen (jeffrngu) <
>> jeffrngu@cisco.com> wrote:
>>
>>> Hi Vanson,
>>>
>>> I opened a JIRA to track this issue last week:
>>> https://issues.apache.org/jira/browse/STRATOS-1293
>>>
>>> -Jeffrey
>>>
>>> On 3/31/15, 3:04 PM, "Vanson Lim (vlim)" <vlim@cisco.com> wrote:
>>>
>>> >Devs,
>>> >
>>> >I've simulated the case where OpenStack fails to bring up a VM (we've
>>> >seen this before in cases where required resources are not available
>>> >or there is some IaaS problem/timeout which caused the VM to fail to
>>> >launch). In this case we cause the failure by giving the cartridge a
>>> >fixed IP address that is not part of the network which the VM attaches
>>> >to. The network is defined with a 10.0.0.0/24 subnet, but I've
>>> >specified a fixed ip=10.0.8.1 to cause the VM startup to fail.
>>> >
>>> >The VM start fails and the VM remains in an error state for the
>>> >"pendingMemberExpiryTimeout" period set in the autoscaler.xml file.
>>> >
>>> >Stratos fails to delete the VM in error state and attempts to start a
>>> >new VM, which also fails to launch.
>>> >
>>> >This presumably repeats itself, creating an additional VM in error state
>>> >during each iteration until we've exhausted all the resources in the
>>> >system.
>>> >
>>> >wso2carbon.log and cartridge definition attached.
>>> >
>>> >-Vanson
>>> >
>>>
>>>
>>
>>
>>  --
>>   *Thanks and Regards,*
>> Anuruddha Lanka Liyanarachchi
>> Software Engineer - WSO2
>> Mobile : +94 (0) 712762611
>> Tel      : +94 112 145 345
>>  anuruddhal@wso2.com
>>
>>
>>
>
> --
> Imesh Gunaratne
>
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>
>

-- 
Sent from Gmail Mobile
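
For reference, the approach discussed in the quoted thread (Stratos manages the retries, and failed VMs get cleaned up even when no instance ID comes back) could be sketched roughly as below. This is only an illustration under stated assumptions, not the Stratos or jclouds code: IaasClient, RetryingLauncher, AlwaysFailingIaas and all of their methods are hypothetical names. The sketch also assumes that each launch attempt can be given a unique name and that instances can later be looked up by that name, which the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical IaaS facade for this sketch; this is NOT the jclouds API. */
interface IaasClient {
    /** Launches a VM with the given name; empty means no instance ID was returned. */
    Optional<String> launch(String name);

    /** Returns the IDs of all instances, in any state, that carry the given name. */
    List<String> findIdsByName(String name);

    void terminate(String instanceId);
}

/** The caller owns the retry loop and removes leftovers after each failed attempt. */
final class RetryingLauncher {
    private final IaasClient iaas;
    private final int maxAttempts;

    RetryingLauncher(IaasClient iaas, int maxAttempts) {
        this.iaas = iaas;
        this.maxAttempts = maxAttempts;
    }

    Optional<String> launchWithCleanup(String memberId) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A unique name per attempt lets us find the VM even without an ID.
            String name = memberId + "-attempt-" + attempt;
            Optional<String> id = iaas.launch(name);
            if (id.isPresent()) {
                return id;
            }
            // No ID came back: look the failed VM up by name and delete it,
            // so failed attempts do not pile up and exhaust resources.
            for (String leftover : iaas.findIdsByName(name)) {
                iaas.terminate(leftover);
            }
        }
        return Optional.empty();
    }
}

/** In-memory fake: every launch leaves a VM behind but reports no ID. */
final class AlwaysFailingIaas implements IaasClient {
    final Map<String, String> instances = new HashMap<>(); // instance ID -> name
    int launches = 0;

    @Override
    public Optional<String> launch(String name) {
        launches++;
        instances.put("vm-" + launches, name);
        return Optional.empty();
    }

    @Override
    public List<String> findIdsByName(String name) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : instances.entrySet()) {
            if (e.getValue().equals(name)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    @Override
    public void terminate(String instanceId) {
        instances.remove(instanceId);
    }
}
```

With the fake above, new RetryingLauncher(new AlwaysFailingIaas(), 3).launchWithCleanup("member-1") makes three attempts, returns an empty result, and leaves no instances behind. Whether a name-based lookup is actually available depends on the IaaS and the client library in use.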
