cloudstack-users mailing list archives

From France <mailingli...@isg.si>
Subject Re: Primary storage failure
Date Thu, 04 Jul 2013 15:32:32 GMT
I've submitted a bug:
https://issues.apache.org/jira/browse/CLOUDSTACK-3367

On 3/7/13 8:39 PM, David Nalley wrote:
> This warrants a bug IMO.
>
> --David
>
> On Wed, Jul 3, 2013 at 2:38 PM, Geoff Higginbottom
> <geoff.higginbottom@shapeblue.com> wrote:
>> Dean,
>>
>> I am guessing you are using NFS for your Primary Storage.
>>
>> This is actually 'by design'. The logic is that if the storage goes offline, then all VMs must have also failed, and a 'forced' reboot of the Host 'might' automatically fix things.
>>
>> This is great if you only have one Primary Storage, but typically you have more than one, so whilst the reboot might fix the failed storage, it will also kill off all the perfectly good VMs which were still happily running.
>>
>> The fix for XenServer Hosts is to:
>>
>> 1. Modify /opt/xensource/bin/xenheartbeat.sh on all your Hosts, commenting out the two entries which have "reboot -f"
>>
>> 2. Identify the PID of the script  - pidof -x xenheartbeat.sh
>>
>> 3. Restart the Script  - kill <pid>
>>
>> 4. Force reconnect Host from the UI,  the script will then re-launch on reconnect
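
Steps 1 to 3 above could be scripted roughly as follows. This is only a sketch under assumptions: the function names `comment_out_forced_reboot` and `stop_heartbeat` are made up for illustration, and the `sed` expression assumes the two entries contain the literal text `reboot -f` on their own lines. Check the file by hand before relying on it. A `.bak` copy of the original is kept next to the edited file.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of XenServer or CloudStack.

# Step 1: comment out every line containing "reboot -f" that is not
# already commented. The original file is saved as <file>.bak.
comment_out_forced_reboot() {
    script_path="$1"
    sed -i.bak '/^[[:space:]]*#/!s/^\(.*reboot -f.*\)$/#\1/' "$script_path"
}

# Steps 2 and 3: find the running heartbeat script and kill it.
# Per step 4, it is re-launched when the Host is force-reconnected from the UI.
stop_heartbeat() {
    pids=$(pidof -x xenheartbeat.sh) || return 0   # nothing running, nothing to do
    kill $pids
}

# Usage on a XenServer Host (run as root):
#   comment_out_forced_reboot /opt/xensource/bin/xenheartbeat.sh
#   stop_heartbeat
```

Step 4 (force reconnect of the Host from the UI) is still done by hand, and the change has to be repeated on every Host.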
>>
>> If you are running KVM, I'm guessing there is a similar script, but I have not tried this yet for anything other than XenServer (it does not apply to ESXi)
>>
>> Regards
>>
>> Geoff Higginbottom
>>
>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>>
>> geoff.higginbottom@shapeblue.com
>>
>>
>> -----Original Message-----
>> From: Dean Kamali [mailto:dean.kamali@gmail.com]
>> Sent: 03 July 2013 19:14
>> To: users@cloudstack.apache.org
>> Subject: Primary storage failure
>>
>> Hello everyone
>>
>> I'm testing failure scenarios, and I have noticed that as soon as the primary storage goes offline, the CloudStack management server seems to think that the hypervisor is not responding and reboots the node. If you have a number of nodes, it will eventually reboot all of them (losing everything .. fun!).
>>
>> What if I have multiple primary storage pools and one of them fails? Will it reboot all of my hypervisors? That doesn't seem right to me.
>>
>> Is there a way to control this behavior?
>>
>> It seems that the CloudStack management server needs to be a little smarter.
>>

