cloudstack-dev mailing list archives

From Anthony Xu <Xuefei...@citrix.com>
Subject [PROPOSAL] Use XS HA to switch XS master host when master host is down
Date Mon, 03 Mar 2014 19:19:40 GMT
Hi All,
I would like to propose using XS HA to switch the XS master host when the XS master host is down.

Reason:
We recently found the issue below:
https://issues.apache.org/jira/browse/CLOUDSTACK-6177

When the XS master is down, CS uses the pool-emergency-transition-to-master and pool-recover-slaves
APIs to choose a new master. These APIs are not safe and should only be used in an emergency.
They may cause XS to use a slightly stale (about 5 seconds old) version of the XS DB. Some objects
may be missing from the old XS DB, which can cause strange behavior, for example you may not be
able to start a VM.


Short term solution

CS no longer performs the XS master switch, which avoids this issue.

Impact:

1.      When the master host is down, CS loses its connection to the whole XS pool (CS cluster). CS
cannot get VM info for this cluster, and the whole cluster is not operable.

2.      The admin is required to recover the XS master host manually. If recovering the XS master
host is not possible, the admin can use pool-emergency-transition-to-master and pool-recover-slaves
to recover the pool. Per the issue I mentioned before, this should be the last resort.
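
As an illustration of the last-resort path in item 2, here is a minimal Python sketch that lists the two xe commands in the order they appear above. The helper names are hypothetical and are not part of CS. It assumes the first command is run on the slave chosen as the new master and the second on that same host afterwards. It only prints the commands unless dry_run is set to False.

```python
import subprocess

# xe subcommands named in this message, in the order given there.
PROMOTE_CMD = ["xe", "pool-emergency-transition-to-master"]  # on the slave chosen as new master
RECOVER_CMD = ["xe", "pool-recover-slaves"]                  # on the new master, afterwards


def last_resort_recovery_commands():
    """Return the two commands in the order an admin would run them."""
    return [list(PROMOTE_CMD), list(RECOVER_CMD)]


def run_last_resort_recovery(dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Hypothetical helper: it assumes it runs on the XS host that is
    to become the new master.
    """
    for cmd in last_resort_recovery_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_last_resort_recovery() with the default dry_run=True just prints the two commands, which keeps the sketch safe to try away from a real pool.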

Long term solution

Integrate XS HA, use XS HA to do XS master switch.

1.      It might take some time to integrate XS HA.

2.      Older free versions of XS do not have the XS HA feature, so users might need to upgrade
to XS 6.2 (which is free) to get the feature.


I think we can fix this issue in two steps.

1.      Since this issue is very critical, CS should stop doing the XS master switch right away
to avoid this issue.

2.      Integrate XS HA.
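
To make the intended behavior after both steps concrete, here is a rough Python sketch of the decision CS would make when it checks the XS master. All names (Action, on_master_check) are hypothetical and only illustrate the proposal; this is not existing CS code. It assumes CS can tell whether the master is reachable and whether XS HA is enabled on the pool.

```python
from enum import Enum


class Action(Enum):
    NONE = "master is up, nothing to do"
    WAIT_FOR_XS_HA = "let XS HA switch the master"
    ALERT_ADMIN = "ask the admin to recover the master manually"


def on_master_check(master_reachable: bool, ha_enabled: bool) -> Action:
    """Pick an action without CS ever switching the master itself."""
    if master_reachable:
        return Action.NONE
    if ha_enabled:
        # Long term solution: XS HA does the master switch.
        return Action.WAIT_FOR_XS_HA
    # Short term solution: no automatic switch, manual recovery only.
    return Action.ALERT_ADMIN
```

In this sketch there is deliberately no branch where CS calls pool-emergency-transition-to-master on its own.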


Comments and suggestions are highly appreciated!

Best Regards.
Anthony
