cloudstack-users mailing list archives

From Jean-Francois Nadeau <the.jfnad...@gmail.com>
Subject Re: Recover VM after KVM host down (and HA not working) ?
Date Sat, 23 Dec 2017 14:49:51 GMT
I'd really like to get to the bottom of this. It does sound like the
behavior described in https://issues.apache.org/jira/browse/CLOUDSTACK-5582,
but that should have been fixed long ago.

One suspect log entry (possibly unrelated) I noticed is this recurring
exception in the manager logs:

ERROR [c.c.v.UserVmManagerImpl] (UserVm-ipfetch-3:ctx-d4c44c2b)
(logid:16dd70ad) Caught the Exception in VmIpFetchTask

I guess this is caused by the use of an external DHCP, so the manager fails
to determine a running VM's IP. Which brings me to my next question: how
is a VM marked for HA actually monitored?


On Sat, Dec 23, 2017 at 3:38 AM, Eric Green <eric.lee.green@gmail.com>
wrote:

> If all else fails, change its state to the correct state in the MySQL
> database and restart the management service. Sadly, that is the only way I
> could do it when my CloudStack got confused and stuck an instance in an
> intermediate state where I couldn't do anything with it.
>
> On Dec 22, 2017 at 9:09 AM, Jean-Francois Nadeau <the.jfnadeau@gmail.com>
> wrote:
>
> Good morning,
>
> New to ACS and doing a POC with 4.10 on CentOS 7 and KVM.
>
> I'm trying to recover VMs after a host failure (powered off from OOB).
>
> Primary storage is NFS and IPMI is configured for the KVM hosts. The zone
> is in advanced mode with VLAN separation, and I created a shared network
> with no services since I wish to use an external DHCP.
>
> First, say I don't have a compute offering with HA enabled and a KVM host
> goes down. I can't put it in maintenance mode while it is down, and
> disabling it has no effect on the state of the lost VMs. The VMs stay in
> the Running state according to the manager. What should I do to force a
> restart on the remaining healthy hosts?
>
> Then I enabled IPMI on all KVM hosts and attempted the same experiment
> with a compute offering with HA enabled. Same result. The manager does see
> the host as disconnected and powered off but takes no action. I am
> certainly missing something here. Please help!
>
> Regards,
>
> Jean-Francois
>
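
For readers following the database workaround quoted above, here is a minimal sketch of what such a manual state change might look like. It only builds a parameterized UPDATE statement and does not connect to anything. The table and column names (vm_instance, state, instance_name, removed) are assumptions about the CloudStack "cloud" schema and should be checked against your own version; the helper name build_state_fix and the instance name in the usage line are hypothetical.

```python
# Sketch only. Assumptions (not confirmed by this thread): the VM record lives
# in a table named vm_instance with columns state, instance_name and removed.
# Verify against your own CloudStack database before running anything.

# Deliberately small whitelist so a typo cannot write an arbitrary state.
ALLOWED_STATES = {"Stopped", "Running"}


def build_state_fix(instance_name, new_state="Stopped"):
    """Return (sql, params) for a parameterized UPDATE of one VM's state."""
    if new_state not in ALLOWED_STATES:
        raise ValueError("unexpected target state: %r" % (new_state,))
    sql = (
        "UPDATE vm_instance SET state = %s "
        "WHERE instance_name = %s AND removed IS NULL"
    )
    return sql, (new_state, instance_name)


if __name__ == "__main__":
    # "i-2-10-VM" is a made-up instance name used purely for illustration.
    statement, params = build_state_fix("i-2-10-VM")
    print(statement)
    print(params)
```

The statement and parameters can be handed to any MySQL client that uses %s placeholders. Taking a database backup first is prudent, and, as the quoted message says, the management service needs a restart afterwards.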
