cloudstack-dev mailing list archives

From France <mailingli...@isg.si>
Subject Re: ALARM - ACS reboots host servers!!!
Date Mon, 03 Mar 2014 08:49:28 GMT
I believe this is a bug too, because VMs that are not running on that 
storage get destroyed too.

The issue has been around for a long time, like all the others I reported. 
They do not get fixed:
https://issues.apache.org/jira/browse/CLOUDSTACK-3367

We even lost the assignee today.

Regards,
F.

On 3/3/14 6:55 AM, Koushik Das wrote:
> The primary storage needs to be put in maintenance before doing any upgrade/reboot as
> mentioned in the previous mails.
>
> -Koushik
>
> On 03-Mar-2014, at 6:07 AM, Marcus <shadowsor@gmail.com> wrote:
>
>> Also, please note that the bug you referenced doesn't complain about
>> the reboot being triggered, but about the fact that the reboot
>> never completes due to a hanging NFS mount (which is also why the reboot
>> occurs: inaccessible primary storage).
>>
>> On Sun, Mar 2, 2014 at 5:26 PM, Marcus <shadowsor@gmail.com> wrote:
>>> Or do you mean you have multiple primary storages and this one was not
>>> in use and put into maintenance?
>>>
>>> On Sun, Mar 2, 2014 at 5:25 PM, Marcus <shadowsor@gmail.com> wrote:
>>>> I'm not sure I understand. How do you expect to reboot your primary
>>>> storage while vms are running?  It sounds like the host is being
>>>> fenced since it cannot contact the resources it depends on.
>>>>
>>>> On Sun, Mar 2, 2014 at 3:24 PM, Nux! <nux@li.nux.ro> wrote:
>>>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>>> Hello guys,
>>>>>>
>>>>>>
>>>>>> I recently came across bug CLOUDSTACK-5429, which rebooted
>>>>>> all of my host servers without properly shutting down the guest VMs.
>>>>>> I simply upgraded and rebooted one of the NFS primary storage
>>>>>> servers and a few minutes later, to my horror, I found that all
>>>>>> of my host servers had been rebooted. Is it just me, or
>>>>>> should this bug be fixed ASAP and be a blocker for any new
>>>>>> ACS release? I mean, not only does it cause downtime, but also possible
>>>>>> data loss and server corruption.
>>>>>
>>>>> Hi Andrei,
>>>>>
>>>>> Do you have HA enabled and did you put that primary storage in maintenance
>>>>> mode before rebooting it?
>>>>> It's my understanding that ACS relies on the shared storage to perform HA, so
>>>>> if the storage goes, it's expected to go berserk. I've noticed similar
>>>>> behaviour in XenServer pools without ACS.
>>>>> I'd imagine a "cure" for this would be to use network distributed
>>>>> "filesystems" like GlusterFS or Ceph.
>>>>>
>>>>> Lucian
>>>>>
>>>>> --
>>>>> Sent from the Delta quadrant using Borg technology!
>>>>>
>>>>> Nux!
>>>>> www.nux.ro

