cloudstack-users mailing list archives

From France <>
Subject Re: ALARM - ACS reboots host servers!!!
Date Mon, 03 Mar 2014 08:49:28 GMT
I believe this is a bug too, because VMs that are not running on that
storage get destroyed as well:

The issue has been around for a long time, like all the others I reported.
They do not get fixed:

We even lost the assignee today.


On 3/3/14 6:55 AM, Koushik Das wrote:
> The primary storage needs to be put in maintenance before doing any
> upgrade/reboot, as mentioned in the previous mails.
> -Koushik
> On 03-Mar-2014, at 6:07 AM, Marcus <> wrote:
>> Also, please note that in the bug you referenced it doesn't have a
>> problem with the reboot being triggered, but with the fact that reboot
>> never completes due to hanging NFS mount (which is why the reboot
>> occurs, inaccessible primary storage).
>> On Sun, Mar 2, 2014 at 5:26 PM, Marcus <> wrote:
>>> Or do you mean you have multiple primary storages and this one was not
>>> in use and put into maintenance?
>>> On Sun, Mar 2, 2014 at 5:25 PM, Marcus <> wrote:
>>>> I'm not sure I understand. How do you expect to reboot your primary
>>>> storage while vms are running?  It sounds like the host is being
>>>> fenced since it cannot contact the resources it depends on.
>>>> On Sun, Mar 2, 2014 at 3:24 PM, Nux! <> wrote:
>>>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>>> Hello guys,
>>>>>> I recently came across the bug CLOUDSTACK-5429, which has rebooted
>>>>>> all of my host servers without properly shutting down the guest VMs.
>>>>>> I simply upgraded and rebooted one of the NFS primary storage
>>>>>> servers and a few minutes later, to my horror, I found out that all
>>>>>> of my host servers had been rebooted. Is it just me, or should this
>>>>>> bug be fixed ASAP and be a blocker for any ACS release? I mean, not
>>>>>> only does it cause downtime, but also possible data loss and server
>>>>>> corruption.
>>>>> Hi Andrei,
>>>>> Do you have HA enabled and did you put that primary storage in maintenance
>>>>> mode before rebooting it?
>>>>> It's my understanding that ACS relies on the shared storage to perform
>>>>> HA, so if the storage goes it's expected to go berserk. I've noticed
>>>>> similar behaviour in XenServer pools without ACS.
>>>>> I'd imagine a "cure" for this would be to use network distributed
>>>>> "filesystems" like GlusterFS or Ceph.
>>>>> Lucian
>>>>> --
>>>>> Sent from the Delta quadrant using Borg technology!
>>>>> Nux!
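
The advice repeated in this thread is to put the primary storage into maintenance before upgrading or rebooting it. As a rough sketch only, the snippet below builds signed API URLs for the CloudStack commands enableStorageMaintenance and cancelStorageMaintenance. The endpoint, API key, secret key and storage pool ID are placeholders, the helper names are hypothetical, and the signing steps follow the documented CloudStack request-signing scheme as I understand it, so check them against your ACS version before relying on this. Nothing is sent over the network; the code only constructs the URLs.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def sign_request(params, secret_key):
    """Return the base64 HMAC-SHA1 signature for a set of API parameters.

    Assumed scheme: sort parameters by lowercased name, URL-encode the
    values, lowercase the whole query string, then sign it.
    """
    ordered = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in ordered)
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def maintenance_url(endpoint, api_key, secret_key, pool_id, enable=True):
    """Build a signed URL that enables or cancels storage maintenance."""
    command = "enableStorageMaintenance" if enable else "cancelStorageMaintenance"
    params = {
        "command": command,
        "id": pool_id,          # ID of the primary storage pool (placeholder)
        "apiKey": api_key,
        "response": "json",
    }
    signature = sign_request(params, secret_key)
    return f"{endpoint}?{urlencode(params)}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # All values below are made-up placeholders, not real credentials.
    endpoint = "http://mgmt.example.invalid:8080/client/api"
    before = maintenance_url(endpoint, "API_KEY", "SECRET_KEY", "POOL_ID", enable=True)
    after = maintenance_url(endpoint, "API_KEY", "SECRET_KEY", "POOL_ID", enable=False)
    print("1. Before rebooting the storage, call:", before)
    print("2. Upgrade/reboot the storage server.")
    print("3. Once it is back, call:", after)
```

The intended order is: enable maintenance, wait for the pool to reach maintenance state, do the upgrade or reboot, then cancel maintenance.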
