cloudstack-users mailing list archives

From Luciano Castro <luciano.cas...@gmail.com>
Subject Re: HA feature - KVM - CloudStack 4.5.1
Date Fri, 17 Jul 2015 23:36:11 GMT
Hi!

I'm using NFS for both primary and secondary storage. I did a graceful shutdown of KVM host A.

Thanks

Sent from my iPhone
Luciano
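Since both storage types here are NFS, one quick thing to check during such a test is the KVM HA heartbeat that the agent keeps on the NFS primary storage pool. A minimal sketch, assuming the conventional KVMHA directory and hb-<host-ip> file naming used by CloudStack's kvmheartbeat.sh script; the mount point and IP below are placeholders, not values from this thread:

```shell
# Placeholders: adjust the pool mount point and host IP to your environment.
pool_mount="/mnt/<primary-pool-uuid>"    # NFS primary storage mount (placeholder)
host_ip="<kvm-host-A-private-ip>"        # private IP of the host under test (placeholder)

hb_file="$pool_mount/KVMHA/hb-$host_ip"
# The timestamp in the heartbeat file should stop advancing once the
# host is really down; the HA investigators use it to judge host state.
if [ -f "$hb_file" ]; then
  cat "$hb_file"
else
  echo "no heartbeat file at $hb_file"
fi
```

If the heartbeat keeps advancing after the "graceful" shutdown, the management server has some reason to still consider the host alive.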

> On 17/07/2015, at 19:00, Milamber <milamber@apache.org> wrote:
> 
> 
> 
>> On 17/07/2015 21:23, Somesh Naidu wrote:
>> Ok, so here are my findings.
>> 
>> 1. Host ID 3 was shut down around 2015-07-16 12:19:09, at which point the management server called a disconnect.
>> 2. Based on the logs, it seems VM IDs 32, 18, 39 and 46 were running on the host.
>> 3. No HA tasks for any of these VMs at this time.
>> 4. Management server restarted at around 2015-07-16 12:30:20.
>> 5. Host ID 3 connected back at around 2015-07-16 12:44:08.
>> 6. Management server identified the missing VMs and triggered HA on those.
>> 7. The VMs were eventually started, all 4 of them.
>> 
>> I am not 100% sure why HA wasn't triggered until 2015-07-16 12:30 (#3), but I know
>> that the management server restart caused it not to happen until the host was reconnected.
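The gap in #3 can be narrowed down by pulling everything the HA manager logged between the shutdown (12:19) and the host reconnect (12:44). A hedged sketch against the log file shared later in this thread; the class name is taken from the log excerpts here, and the time window is an assumption based on the timeline above:

```shell
# Filter HighAvailabilityManagerImpl entries to the 12:19-12:45 window.
LOG=/var/log/cloudstack/management/management-server.log.2015-07-16.gz
if [ -f "$LOG" ]; then
  zgrep 'HighAvailabilityManagerImpl' "$LOG" \
    | awk '{ts=$1" "$2} ts>="2015-07-16 12:19:00" && ts<="2015-07-16 12:45:00"'
fi
```

If nothing at all shows up in that window, HA scheduling was simply not attempted before the management server restart.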
> 
> Perhaps the management server didn't recognize host 3 as totally down (ping still
> alive? or some quorum check not OK)?
> Was the only way for the management server to fully accept that host 3 had a real
> problem the fact that host 3 was rebooted (around 12:44)?
> 
> What is the storage subsystem? CLVMd?
> 
> 
>> 
>> Regards,
>> Somesh
>> 
>> 
>> -----Original Message-----
>> From: Luciano Castro [mailto:luciano.castro@gmail.com]
>> Sent: Friday, July 17, 2015 12:13 PM
>> To: users@cloudstack.apache.org
>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>> 
>> No problems Somesh, thanks for your help.
>> 
>> Link of log:
>> 
>> https://dl.dropboxusercontent.com/u/6774061/management-server.log.2015-07-16.gz
>> 
>> Luciano
>> 
>> On Fri, Jul 17, 2015 at 12:00 PM, Somesh Naidu <Somesh.Naidu@citrix.com>
>> wrote:
>> 
>>> How large is the management server logs dated 2015-07-16? I would like to
>>> review the logs. All the information I need from that incident should be in
>>> there so I don't need any more testing.
>>> 
>>> Regards,
>>> Somesh
>>> 
>>> -----Original Message-----
>>> From: Luciano Castro [mailto:luciano.castro@gmail.com]
>>> Sent: Friday, July 17, 2015 7:58 AM
>>> To: users@cloudstack.apache.org
>>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>>> 
>>> Hi Somesh!
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> [root@1q2 ~]# zgrep -i -E
>>> 
>>> 'SimpleInvestigator|KVMInvestigator|PingInvestigator|ManagementIPSysVMInvestigator'
>>> /var/log/cloudstack/management/management-server.log.2015-07-16.gz |tail
>>> -5000 > /tmp/management.txt
>>> [root@1q2 ~]# cat /tmp/management.txt
>>> 2015-07-16 12:30:45,452 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null)
>>> Registering extension [KVMInvestigator] in [Ha Investigators Registry]
>>> 2015-07-16 12:30:45,452 DEBUG [o.a.c.s.l.r.RegistryLifecycle] (main:null)
>>> Registered com.cloud.ha.KVMInvestigator@57ceec9a
>>> 2015-07-16 12:30:45,927 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null)
>>> Registering extension [PingInvestigator] in [Ha Investigators Registry]
>>> 2015-07-16 12:30:45,928 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null)
>>> Registering extension [ManagementIPSysVMInvestigator] in [Ha Investigators
>>> Registry]
>>> 2015-07-16 12:30:53,796 INFO  [o.a.c.s.l.r.DumpRegistry] (main:null)
>>> Registry [Ha Investigators Registry] contains [SimpleInvestigator,
>>> XenServerInvestigator, KVMInv
>>> 
>>> I searched this log before, but, as I thought, there was nothing
>>> special in it.
>>> 
>>> If you want to propose another test scenario to me, I can run it.
>>> 
>>> Thanks
>>> 
>>> 
>>> On Thu, Jul 16, 2015 at 7:27 PM, Somesh Naidu <Somesh.Naidu@citrix.com>
>>> wrote:
>>> 
>>>> What about other investigators, specifically " KVMInvestigator,
>>>> PingInvestigator"? They report the VMs as alive=false too?
>>>> 
>>>> Also, it is recommended that you look at the management-server.log instead
>>>> of catalina.out (for one, the latter doesn't have timestamps).
>>>> 
>>>> Regards,
>>>> Somesh
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Luciano Castro [mailto:luciano.castro@gmail.com]
>>>> Sent: Thursday, July 16, 2015 1:14 PM
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>>>> 
>>>> Hi Somesh!
>>>> 
>>>> 
>>>> thanks for the help. I did it again and collected new logs:
>>>> 
>>>> My vm_instance name is i-2-39-VM. There were some routers on KVM host 'A'
>>>> (the one that I powered off this time):
>>>> 
>>>> 
>>>> [root@1q2 ~]# grep -i -E 'SimpleInvestigator.*false'
>>>> /var/log/cloudstack/management/catalina.out
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-e2f91c9c
>>> work-3)
>>>> SimpleInvestigator found VM[DomainRouter|r-4-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-729acf4f
>>> work-7)
>>>> SimpleInvestigator found VM[User|i-23-33-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-a66a4941
>>> work-8)
>>>> SimpleInvestigator found VM[DomainRouter|r-36-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-5977245e
>>>> work-10) SimpleInvestigator found VM[User|i-17-26-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-c7f39be0
>>> work-9)
>>>> SimpleInvestigator found VM[DomainRouter|r-32-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-ad4f5fda
>>>> work-10) SimpleInvestigator found VM[DomainRouter|r-46-VM]to be alive?
>>>> false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-0:ctx-0257f5af
>>>> work-11) SimpleInvestigator found VM[User|i-4-52-VM]to be alive? false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-7ddff382
>>>> work-12) SimpleInvestigator found VM[DomainRouter|r-32-VM]to be alive?
>>>> false
>>>> INFO  [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-9f79917e
>>>> work-13) SimpleInvestigator found VM[User|i-2-39-VM]to be alive? false
>>>> 
>>>> 
>>>> 
>>>> KVM host 'B' agent log (where the machine would be migrated):
>>>> 
>>>> 2015-07-16 16:58:56,537 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Live migration of instance i-2-39-VM
>>>> initiated
>>>> 2015-07-16 16:58:57,540 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to
>>>> complete, waited 1000ms
>>>> 2015-07-16 16:58:58,541 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to
>>>> complete, waited 2000ms
>>>> 2015-07-16 16:58:59,542 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to
>>>> complete, waited 3000ms
>>>> 2015-07-16 16:59:00,543 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to
>>>> complete, waited 4000ms
>>>> 2015-07-16 16:59:01,245 INFO  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-4:null) Migration thread for i-2-39-VM is done
>>>> 
>>>> It said done for my i-2-39-VM instance, but I can't ping this host.
>>>> 
>>>> Luciano
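When the agent reports the migration thread as done but the instance is unreachable, it can help to separate two questions: did libvirt actually end up with a running domain on host B, and is the network path (e.g. the virtual routers that HA was also restarting) intact? A minimal check to run on host B, assuming root access and the instance name i-2-39-VM from the agent log above:

```shell
# Run on KVM host B; requires virsh (libvirt client) and root privileges.
if command -v virsh >/dev/null 2>&1; then
  virsh domstate i-2-39-VM                 # "running" means migration really completed
  virsh list --all | grep -w i-2-39-VM     # confirm the domain is defined on this host
fi
```

If the domain is running but ping still fails, the problem is more likely in the guest network path than in the migration itself.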
>>> 
>>> 
>>> --
>>> Luciano Castro
> 
