cloudstack-users mailing list archives

From Nik Martin <nik.mar...@nfinausa.com>
Subject reconnecting to host in alert state - cloud cocked up
Date Wed, 16 Jan 2013 15:12:32 GMT
Ok, this is a new thread centered on a serious problem in my CloudStack
3.0.2 cloud, running XenServer 6.0.2 hosts.  Here is what has transpired so far:
1: user reports console proxy not available
2: confirm console proxy not available, issue reboot via cloudstack UI
3: CS reports VM booted ok, still unavailable
4: tried to migrate to different host, VM stuck in migrating state
5: Logged in to the host; the list_domains command does not show the VM,
but shows a domain in this state:
117 | deadbeef-dead-beef-dead-beef00000075 | DS
which is a strong sign that the VM is badly hung.
6: attempt to destroy domain according to Citrix Support article:
/opt/xensource/debug/destroy_domain -domid 117
7: command hangs
8: I then restart the XAPI toolstack, and it appears to restart fine. I
should note that ALL VMs are on this host via the "first_fit" VM
provisioning algorithm
9: I attempt to start migrating VMs to two other available hosts in 
preparation for a hard reboot of host
10: migrating VMs fails, and host is now in alert state in CS, and CS 
log states that host is unavailable. Force reconnect fails.
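
For anyone who wants to check for the same symptom as in step 5, here is a minimal Python sketch that scans list_domains output for domains that look stuck. It assumes the output is pipe-separated as "id | uuid | state" and that a "D" in the state column means the domain is dying; both are assumptions drawn from the single line quoted above, not from documentation. The function name find_stuck_domains and the dom0 row in the sample are hypothetical.

```python
def find_stuck_domains(listing):
    """Return (domid, uuid, state) for rows whose state flags include 'D'.

    Assumes rows look like 'id | uuid | state' (pipe-separated) and that
    'D' in the state column marks a dying domain.
    """
    stuck = []
    for line in listing.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip the header row, blank lines, and anything malformed.
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        domid, uuid, state = int(parts[0]), parts[1], parts[2]
        if "D" in state:
            stuck.append((domid, uuid, state))
    return stuck


# Sample input: the header and the dom0 row are made up for illustration;
# the domain 117 row is the one quoted in step 5.
sample = """\
id |                                 uuid |  state
  0 | 11111111-2222-3333-4444-555555555555 | R
117 | deadbeef-dead-beef-dead-beef00000075 | DS
"""

for domid, uuid, state in find_stuck_domains(sample):
    print(f"domain {domid} ({uuid}) is in state {state}")
```

In practice the listing would come from running list_domains on the host and passing its output to the function; the sketch only reports candidates and does not attempt to destroy anything.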

So here I am, in a production environment, facing exactly the kind of
failure that cloud computing is designed to handle, and the cloud
itself is the root cause of the downtime it is intended to prevent.

Do I have any other options to prevent downtime? I have exhausted
everything I know to do.  I have already scheduled a maintenance window,
and fudged the truth to my customers by stating that there should be no
downtime during this window, which I have zero faith will actually be true.


-- 

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability
