cloudstack-dev mailing list archives

From Stephen Turner <Stephen.Tur...@citrix.com>
Subject RE: [DISCUSS] XenServer and HA: the way forward
Date Wed, 27 May 2015 09:45:00 GMT
I'm sorry to come late to this thread, but I only picked it up from Remi's blog post [*] over
the weekend.

I'm certainly not going to defend the way this change came in under the radar, but speaking
as a member of the XenServer development team, I wouldn't want to go back to the old behaviour.
The risk is not just theoretical: we had at least one customer with serious data corruption
problems as a result of the bad interaction between the CloudStack code and XenServer. I wonder
if there's an alternative possibility where CloudStack makes sure that XenServer HA is turned
on, and turns it on itself / gives you warnings if it isn't / something?

-- 
Stephen Turner

[*] http://blog.remibergsma.com/2015/05/23/making-xenserver-and-cloudstack-sing-and-dance-together-again/



-----Original Message-----
From: Remi Bergsma [mailto:remi@remi.nl] 
Sent: 04 May 2015 11:04
To: dev@cloudstack.apache.org
Subject: [DISCUSS] XenServer and HA: the way forward

Hi all,

Since CloudStack 4.4, the implementation of HA in CloudStack has used the XenHA feature
of XenServer. As of 4.4, XenHA is expected to be enabled for the pool (not for the VMs!),
so XenServer is the one to elect a new pool master, whereas CloudStack did it before.
Also, should storage become unavailable, XenHA takes care of fencing the box instead of CloudStack.
To be exact, they both try to fence, but XenHA is usually faster.

To be 100% clear: HA on VMs is in all cases done by CloudStack. It's just that without a pool
master, no VMs will be recovered anyway. This gave me some headaches, first of all because
I didn't know about the change. We probably need to document this somewhere. This is important, because without
XenHA turned on you won't get a new pool master (a behaviour change).

Personally, I don't like the fact that we have "two captains" in case something goes wrong.
But, some say they like this behaviour. I'm OK with both, as long as one can choose whatever
suits their needs best.

In Austin I talked to several people about this. We came up with the idea to have CloudStack
check whether XenHA is on or not. If it is, it does the current 4.4+ behaviour (XenHA selects
new pool master). When it is not, we do the CloudStack 4.3 behaviour where CloudStack is fully
in control.
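
To make the proposal concrete, here is a minimal sketch of the decision described above. It is not CloudStack code: the names `MasterElection` and `choose_master_election` are hypothetical, and how CloudStack would actually read the pool's XenHA state is left out and passed in as a plain boolean.

```python
from enum import Enum


class MasterElection(Enum):
    """Who is responsible for electing a new pool master (hypothetical names)."""
    XENSERVER = "xenha"        # current 4.4+ behaviour: XenHA elects the new master
    CLOUDSTACK = "cloudstack"  # 4.3 behaviour: CloudStack is fully in control


def choose_master_election(pool_ha_enabled: bool) -> MasterElection:
    """Pick the party that elects a new pool master.

    pool_ha_enabled is assumed to reflect whether XenHA is turned on
    for the pool (not for individual VMs).
    """
    if pool_ha_enabled:
        return MasterElection.XENSERVER
    return MasterElection.CLOUDSTACK
```

Either way, HA on the VMs themselves would still be done by CloudStack; the sketch only covers who picks the new pool master.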

I also talked to Tim Mackey and he wants to help implement this, but he doesn't have much
time. The idea is to have someone else join in to code the change, and then Tim will be able
to help out on a regular basis should we need in-depth knowledge of XenServer or its implementation
in CloudStack.

Before we kick this off, I'd like to discuss and agree that this is the way forward. Also,
if you're interested in joining this effort let me know and I'll kick it off.

Regards,
Remi