cloudstack-users mailing list archives

From Suresh Sadhu <Suresh.Sa...@citrix.com>
Subject RE: Cloudstack agent keeps rebooting kvm host..
Date Fri, 22 Jun 2012 16:55:34 GMT
Hi Alex,

When the heartbeat fails, the host reboots repeatedly until the problem is resolved (that is, until a heartbeat write succeeds).


The heartbeat failure may be caused by a failure to write to the mounted storage path.
Do you see any "permission denied" messages in the logs, and do your mounted storage paths
still have rw permissions after this problem appears? If the mounted file system is corrupted,
it may have become read-only, and that would cause the heartbeat to fail.
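
One way to test the write-path theory above by hand is a small probe like the one below. This is a minimal sketch, not part of CloudStack: `probe_write` is a hypothetical helper name, and the directory in the example is the KVMHA path taken from the agent log in the original message. It only tries to create and delete a temporary file in that directory, then reports why the attempt failed.

```python
import errno
import os
import tempfile


def probe_write(directory):
    """Try to create and remove a small file in `directory`.

    Returns (True, "") on success, or (False, reason) on failure.
    `probe_write` is a hypothetical helper, not a CloudStack function.
    """
    try:
        fd, name = tempfile.mkstemp(prefix="hb-probe-", dir=directory)
    except OSError as exc:
        if exc.errno == errno.EROFS:
            return False, "file system is mounted read-only"
        if exc.errno in (errno.EACCES, errno.EPERM):
            return False, "permission denied"
        if exc.errno == errno.ENOENT:
            return False, "directory does not exist"
        return False, str(exc)
    # The file was created, so the path is writable; clean up again.
    os.close(fd)
    os.unlink(name)
    return True, ""


if __name__ == "__main__":
    # Path copied from the agent log in the original message.
    path = "/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA"
    ok, reason = probe_write(path)
    print("writable" if ok else "not writable: " + reason)
```

Run it as the same user the agent runs as, since a result obtained as a different user says little about the agent's own permissions.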
 


Regards
Sadhu


-----Original Message-----
From: Alexey Zilber [mailto:alexeyzilber@gmail.com] 
Sent: 22 June 2012 21:52
To: cloudstack-users@incubator.apache.org
Subject: Cloudstack agent keeps rebooting kvm host..

Hi,

  The saga continues!  I added a KVM host.  The agent decided it wants to constantly reboot
the server:

2012-06-23 00:11:32,083{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:) Startup Response
Received: agent id = 5
2012-06-23 00:11:32,083 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response
Received: agent id = 5
2012-06-23 00:12:30,187{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 0
2012-06-23 00:12:30,187 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 0
2012-06-23 00:12:30,209{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 1
2012-06-23 00:12:30,209 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 1
2012-06-23 00:12:30,232{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 2
2012-06-23 00:12:30,232 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 2
2012-06-23 00:12:30,254{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 3
2012-06-23 00:12:30,254 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 3
2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 4
2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17,
retry: 4
2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17;
reboot the host
2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17;
reboot the host

Broadcast message from root@kvm1.xxxxx.xxxx
        (unknown) at 0:12 ...

The system is going down for reboot NOW!

It looks like the agent was, in fact, at least able to create the initial
directory:

[root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
total 8
drwxrwxrwx  2 root root 4096 Jun 22 23:58 .
drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..

Here's the agent properties file:

#Storage
#Sat Jun 23 00:11:32 MYT 2012
guest.network.device=cloudbr0
workers=5
private.network.device=cloudbr0
port=8250
resource=com.cloud.agent.resource.computing.LibvirtComputingResource
pod=1
zone=1
guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
cluster=2
public.network.device=cloudbr0
local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
host=10.1.1.18
LibvirtComputingResource.id=5


This is the first time I'm seeing this error. Last time my KVM setup went well, but KVM was my first
hypervisor then; now it's the second.

Thanks!
Alex
