cloudstack-users mailing list archives

From Alexey Zilber <alexeyzil...@gmail.com>
Subject Re: Cloudstack agent keeps rebooting kvm host..
Date Fri, 22 Jun 2012 17:05:48 GMT
So, I noticed that the very top of agent.log reports missing jars:

/usr/bin/build-classpath: error: Could not find commons-collections.jar
Java extension for this JVM
/usr/bin/build-classpath: error: Could not find commons-dbcp.jar Java
extension for this JVM
/usr/bin/build-classpath: error: Could not find commons-pool.jar Java
extension for this JVM
/usr/bin/build-classpath: error: Could not find ws-commons-util.jar Java
extension for this JVM
/usr/bin/build-classpath: error: Could not find jnetpcap.jar Java extension
for this JVM
/usr/bin/build-classpath: error: Could not find tomcat6-servlet-2.5-api.jar
Java extension for this JVM
/usr/bin/build-classpath: error: Could not find
tomcat6-el-2.1-api-6.0.24.jar Java extension for this JVM
/usr/bin/build-classpath: error: Could not find
tomcat6-jsp-2.1-api-6.0.24.jar Java extension for this JVM
/usr/bin/build-classpath: error: Some specified jars were not found

I installed the packages that provide all of the missing jars.  It didn't
help...
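
In case it helps anyone comparing against their own agent.log, here is a
minimal Python sketch that pulls the jar names out of the build-classpath
errors above.  It is not part of CloudStack; the function name missing_jars is
made up for illustration, and it assumes the messages look exactly like the
ones pasted here (possibly wrapped across lines by the mail client).

```python
import re

# One build-classpath error, once wrapped lines have been re-joined.
MISSING_JAR = re.compile(r"Could not find\s+(\S+\.jar)")


def missing_jars(log_text):
    """Return the jar names reported missing, in order, without duplicates."""
    # Long lines may be wrapped, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    found = []
    for name in MISSING_JAR.findall(flat):
        if name not in found:
            found.append(name)
    return found


sample = """\
/usr/bin/build-classpath: error: Could not find commons-collections.jar
Java extension for this JVM
/usr/bin/build-classpath: error: Could not find
tomcat6-el-2.1-api-6.0.24.jar Java extension for this JVM
"""
print(missing_jars(sample))
```

The resulting list can then be checked against whatever the distribution's
package manager reports as installed.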

I put the host into maintenance mode, so after it restarted I kept seeing
this in the logs:

2012-06-23 00:43:16,688{GMT} INFO  [utils.nio.NioClient] (Agent-Selector:)
Connecting to 10.1.1.18:8250
2012-06-23 00:43:16,688 INFO  [utils.nio.NioClient] (Agent-Selector:null)
Connecting to 10.1.1.18:8250
2012-06-23 00:43:16,774{GMT} INFO  [utils.nio.NioClient] (Agent-Selector:)
SSL: Handshake done
2012-06-23 00:43:16,774 INFO  [utils.nio.NioClient] (Agent-Selector:null)
SSL: Handshake done
2012-06-23 00:43:17,274{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Proccess agent startup answer, agent id = 5
2012-06-23 00:43:17,274 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Proccess agent startup answer, agent id = 5
2012-06-23 00:43:17,274{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Set agent id 5
2012-06-23 00:43:17,274 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Set agent id 5
2012-06-23 00:43:17,275{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Startup Response Received: agent id = 5
2012-06-23 00:43:17,275 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Startup Response Received: agent id = 5
2012-06-23 00:43:17,399{GMT} WARN  [cloud.agent.Agent] (Agent-Handler-4:)
Unable to send response: null
2012-06-23 00:43:17,399 WARN  [cloud.agent.Agent] (Agent-Handler-4:null)
Unable to send response: null
2012-06-23 00:43:17,411{GMT} WARN  [cloud.agent.Agent] (UgentTask-5:)
Unable to send request: null
2012-06-23 00:43:17,411 WARN  [cloud.agent.Agent] (UgentTask-5:null) Unable
to send request: null
2012-06-23 00:43:21,775{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-5:)
Connected to the server
2012-06-23 00:43:21,775 INFO  [cloud.agent.Agent] (Agent-Handler-5:null)
Connected to the server
2012-06-23 00:43:22,295{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-5:)
Lost connection to the server. Dealing with the remaining commands...
2012-06-23 00:43:22,295 INFO  [cloud.agent.Agent] (Agent-Handler-5:null)
Lost connection to the server. Dealing with the remaining commands...
2012-06-23 00:43:27,297{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-5:)
Reconnecting...
2012-06-23 00:43:27,297 INFO  [cloud.agent.Agent] (Agent-Handler-5:null)
Reconnecting...
2012-06-23 00:43:27,298{GMT} INFO  [utils.nio.NioClient] (Agent-Selector:)
Connecting to 10.1.1.18:8250
2012-06-23 00:43:27,298 INFO  [utils.nio.NioClient] (Agent-Selector:null)
Connecting to 10.1.1.18:8250
2012-06-23 00:43:27,384{GMT} INFO  [utils.nio.NioClient] (Agent-Selector:)
SSL: Handshake done
2012-06-23 00:43:27,384 INFO  [utils.nio.NioClient] (Agent-Selector:null)
SSL: Handshake done
2012-06-23 00:43:27,907{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Proccess agent startup answer, agent id = 5
2012-06-23 00:43:27,907 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Proccess agent startup answer, agent id = 5
2012-06-23 00:43:27,907{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Set agent id 5
2012-06-23 00:43:27,907 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Set agent id 5
2012-06-23 00:43:27,908{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Startup Response Received: agent id = 5
2012-06-23 00:43:27,908 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Startup Response Received: agent id = 5
2012-06-23 00:43:28,033{GMT} WARN  [cloud.agent.Agent] (Agent-Handler-4:)
Unable to send response: null
2012-06-23 00:43:28,033 WARN  [cloud.agent.Agent] (Agent-Handler-4:null)
Unable to send response: null
2012-06-23 00:43:28,045{GMT} WARN  [cloud.agent.Agent] (UgentTask-5:)
Unable to send request: null
2012-06-23 00:43:28,045 WARN  [cloud.agent.Agent] (UgentTask-5:null) Unable
to send request: null


This repeats constantly, but there are no reboots.

As soon as I took the host out of maintenance, the reboots started:

2012-06-23 00:44:36,787{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
Startup Response Received: agent id = 5
2012-06-23 00:44:36,787 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
Startup Response Received: agent id = 5
2012-06-23 00:45:35,713{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
2012-06-23 00:45:35,713 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
2012-06-23 00:45:35,736{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
2012-06-23 00:45:35,736 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
2012-06-23 00:45:35,759{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
2012-06-23 00:45:35,759 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
2012-06-23 00:45:35,782{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
2012-06-23 00:45:35,782 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
2012-06-23 00:45:35,805{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
2012-06-23 00:45:35,805 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
2012-06-23 00:45:35,805{GMT} WARN  [resource.computing.KVMHAMonitor]
(Thread-7:) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the
host
2012-06-23 00:45:35,805 WARN  [resource.computing.KVMHAMonitor]
(Thread-7:null) write heartbeat failed: Failed to create
/mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the
host

Broadcast message from root@kvm1.xxxxx.xxx
        (unknown) at 0:45 ...

The system is going down for reboot NOW!
Killing VMOps Agent (PID 5417) with SIGTERM
Waiting for agent to exit
2012-06-23 00:45:43,203{GMT} INFO  [cloud.agent.Agent]
(AgentShutdownThread:) Stopping the agent: Reason = sig.kill
2012-06-23 00:45:43,203 INFO  [cloud.agent.Agent]
(AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
libvir: RPC error : Cannot write data: Broken pipe
libvir: RPC error : Failed to connect socket to
'/var/run/libvirt/libvirt-sock': No such file or directory
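
For reference, a rough Python sketch of a manual check for the heartbeat
location.  Only the <mount>/KVMHA/hb-<ip> layout comes from the log lines
above; the function name, the probe file name, and the idea that the directory
must be a mounted filesystem are assumptions of this sketch, not something
taken from the agent's code.

```python
import os


def check_heartbeat_dir(mount_dir, host_ip, require_mount=True):
    """Return (ok, detail) for a trial write under <mount_dir>/KVMHA.

    The directory layout is copied from the agent log; the probe file name
    and the checks themselves are assumptions made for this sketch.
    """
    if require_mount and not os.path.ismount(mount_dir):
        return False, mount_dir + " is not a mount point"
    hb_dir = os.path.join(mount_dir, "KVMHA")
    if not os.path.isdir(hb_dir):
        return False, hb_dir + " does not exist"
    # Use a separate probe file so a real hb-<ip> file is never touched.
    probe = os.path.join(hb_dir, "probe-hb-" + host_ip)
    try:
        with open(probe, "w") as handle:
            handle.write("probe\n")
        os.remove(probe)
    except OSError as exc:
        return False, "cannot write " + probe + ": " + str(exc)
    return True, "trial write under " + hb_dir + " succeeded"


if __name__ == "__main__":
    print(check_heartbeat_dir(
        "/mnt/9c2be815-de2b-3c14-84bb-54025d782794", "10.1.1.17"))
```

Run as root on the KVM host, this would at least show which of the three
steps (mount, KVMHA directory, file creation) fails first.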


Logs from the management server:

2012-06-23 00:45:12,024 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(AgentManager-Handler-15:null) Cleanup succeeded. Details null
2012-06-23 00:45:12,024 DEBUG [agent.transport.Request]
(StatsCollector-2:null) Seq 5-696975367: Received:  { Ans: , MgmtId:
107158699232, via: 5, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2012-06-23 00:45:12,025 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-2:null) Cleanup succeeded. Details null
2012-06-23 00:45:31,087 DEBUG
[storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone 1
is ready to launch secondary storage VM
2012-06-23 00:45:31,376 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-2:null) Ping from 3
2012-06-23 00:45:31,640 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl]
(consoleproxy-1:null) Zone 1 is ready to launch console proxy
2012-06-23 00:45:32,950 DEBUG
[network.router.VirtualNetworkApplianceManagerImpl]
(RouterStatusMonitor-1:null) Found 0 routers.
2012-06-23 00:45:36,758 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-143:null) Ping from 1
2012-06-23 00:45:36,845 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-9:null) Ping from 5
2012-06-23 00:45:43,170 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-4:null) SeqA 5--1: Processing Seq 5--1:  { Cmd ,
MgmtId: -1, via: 5, Ver: v1, Flags: 111,
[{"ShutdownCommand":{"reason":"sig.kill","wait":0}}] }
2012-06-23 00:45:43,171 INFO  [agent.manager.AgentManagerImpl]
(AgentManager-Handler-4:null) Host 5 has informed us that it is shutting
down with reason sig.kill and detail null
2012-06-23 00:45:43,172 INFO  [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Host 5 is disconnecting with event ShutdownRequested
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) The next status of agent 5is Disconnected, current
status is Up
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Deregistering link for 5 with state Disconnected
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Remove Agent : 5
2012-06-23 00:45:43,176 DEBUG [agent.manager.ConnectedAgentAttache]
(AgentTaskPool-7:null) Processing Disconnect.
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentAttache]
(AgentTaskPool-7:null) Seq 5-696975361: Sending disconnect to class
com.cloud.network.security.SecurityGroupListener
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer$$EnhancerByCGLIB$$237690f3
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.hypervisor.vmware.VmwareManagerImpl$$EnhancerByCGLIB$$61c8b63f
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.vm.ClusteredVirtualMachineManagerImpl$$EnhancerByCGLIB$$203f1b98
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.network.security.SecurityGroupListener
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.storage.listener.StoragePoolMonitor
2012-06-23 00:45:43,176 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.storage.secondary.SecondaryStorageListener
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.network.NetworkManagerImpl$$EnhancerByCGLIB$$c9fefab4
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.agent.manager.AgentMonitor$$EnhancerByCGLIB$$3a9f7d14
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.storage.download.DownloadListener
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.storage.upload.UploadListener
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.network.SshKeysDistriMonitor
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.network.router.VirtualNetworkApplianceManagerImpl$$EnhancerByCGLIB$$4b2da131
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.capacity.StorageCapacityListener
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.capacity.ComputeCapacityListener
2012-06-23 00:45:43,177 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.consoleproxy.ConsoleProxyListener
2012-06-23 00:45:43,180 DEBUG [agent.manager.AgentManagerImpl]
(AgentTaskPool-7:null) Sending Disconnect to listener:
com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
2012-06-23 00:45:43,180 DEBUG [cloud.network.NetworkUsageManagerImpl]
(AgentTaskPool-7:null) Disconnected called on 5 with status Disconnected
2012-06-23 00:45:43,180 DEBUG [cloud.host.Status] (AgentTaskPool-7:null)
Transition:[Resource state = Enabled, Agent event = ShutdownRequested, Host
id = 5, name = kvm1.xxxx.xxx]
2012-06-23 00:45:43,250 DEBUG [cloud.host.Status] (AgentTaskPool-7:null)
Agent status update: [id = 5; name = kvm1.xxxx.xxx; old status = Up; event
= ShutdownRequested; new status = Disconnected; old update count = 71; new
update count = 72]
2012-06-23 00:45:43,250 DEBUG [agent.manager.ClusteredAgentManagerImpl]
(AgentTaskPool-7:null) Notifying other nodes of to disconnect
2012-06-23 00:45:44,825 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-146:null) Seq 1-493813762: Executing request
2012-06-23 00:45:45,176 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-146:null) Seq 1-493813762: Response Received:
2012-06-23 00:45:45,176 DEBUG [agent.transport.Request]
(DirectAgent-146:null) Seq 1-493813762: Processing:  { Ans: , MgmtId:
107158699232, via: 1, Ver: v1, Flags: 10,
[{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
}
2012-06-23 00:45:47,234 DEBUG [cloud.async.AsyncJobManagerImpl]
(catalina-exec-11:null) submit async job-19, details: AsyncJobVO {id:19,
userId: 2, accountId: 2, sessionKey: null, instanceType: Host, instanceId:
5, cmd: com.cloud.api.commands.PrepareForMaintenanceCmd, cmdOriginator:
null, cmdInfo:
{"response":"json","id":"81ab632c-1d4e-4bb3-b983-7d1893c48a0b","sessionkey":"1rfcfk3q3RVHbK3/50aEv8jFaK0\u003d","ctxUserId":"2","_":"1340383513748","ctxAccountId":"2","ctxStartEventId":"63"},
cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0,
processStatus: 0, resultCode: 0, result: null, initMsid: 107158699232,
completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2012-06-23 00:45:47,237 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-19:job-19) Executing
com.cloud.api.commands.PrepareForMaintenanceCmd for job-19
2012-06-23 00:45:47,254 DEBUG [agent.manager.AgentManagerImpl]
(Job-Executor-19:job-19) Can not send command
com.cloud.agent.api.MaintainCommand due to Host 5 is not up
2012-06-23 00:45:47,254 WARN  [cloud.resource.ResourceManagerImpl]
(Job-Executor-19:job-19) Unable to send MaintainCommand to host: 5
2012-06-23 00:45:47,359 DEBUG [cloud.resource.ResourceState]
(Job-Executor-19:job-19) Resource state update: [id = 5; name =
kvm1.xxxxx.xxx; old state = Enabled; event = AdminAskMaintenace; new state
= PrepareForMaintenance]
2012-06-23 00:45:47,459 DEBUG [cloud.resource.ResourceManagerImpl]
(Job-Executor-19:job-19) Sent resource event
EVENT_PREPARE_MAINTENANCE_AFTER to listener
CapacityManagerImpl$$EnhancerByCGLIB$$44863ff4
2012-06-23 00:45:47,474 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-19:job-19) Complete async job-19, jobStatus: 1, resultCode:
0, result: com.cloud.api.response.HostResponse@3ece1b58
2012-06-23 00:45:47,534 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-19:job-19) Done executing
com.cloud.api.commands.PrepareForMaintenanceCmd for job-19
2012-06-23 00:45:48,895 DEBUG [cloud.server.StatsCollector]
(StatsCollector-3:null) StorageCollector is running...
2012-06-23 00:45:48,975 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(AgentManager-Handler-14:null) Cleanup succeeded. Details null
2012-06-23 00:45:48,975 DEBUG [agent.transport.Request]
(StatsCollector-3:null) Seq 3-361037893: Received:  { Ans: , MgmtId:
107158699232, via: 3, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2012-06-23 00:45:48,975 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-3:null) Cleanup succeeded. Details null
2012-06-23 00:45:48,983 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-324:null) Seq 1-493813943: Executing request
2012-06-23 00:45:49,641 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-324:null) Seq 1-493813943: Response Received:
2012-06-23 00:45:49,642 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-324:null) Cleanup succeeded. Details null
2012-06-23 00:45:49,642 DEBUG [agent.transport.Request]
(StatsCollector-3:null) Seq 1-493813943: Received:  { Ans: , MgmtId:
107158699232, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2012-06-23 00:45:49,642 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-3:null) Cleanup succeeded. Details null
2012-06-23 00:45:50,806 DEBUG [cloud.server.StatsCollector]
(StatsCollector-2:null) VmStatsCollector is running...
2012-06-23 00:45:52,262 DEBUG [cloud.async.AsyncJobManagerImpl]
(catalina-exec-6:null) Async job-19 completed
2012-06-23 00:45:52,572 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-11:null) Ping from 4
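
To follow what the management server thinks of a host without reading the
whole log, a small Python sketch like the following can pick out the "Agent
status update" entries.  The regular expression is written only against the
entry format pasted above and is an assumption about the log layout in
general; the function name status_updates is made up.

```python
import re

# Field layout copied from the "Agent status update" entry in the log above.
STATUS_UPDATE = re.compile(
    r"Agent status update: \[id = (?P<id>\d+); name = (?P<name>[^;]+); "
    r"old status = (?P<old>\w+); event = (?P<event>\w+); "
    r"new status = (?P<new>\w+);"
)


def status_updates(log_text):
    """Return one dict per host status update found in the log text."""
    # Entries may be wrapped across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [match.groupdict() for match in STATUS_UPDATE.finditer(flat)]


sample = """\
2012-06-23 00:45:43,250 DEBUG [cloud.host.Status] (AgentTaskPool-7:null)
Agent status update: [id = 5; name = kvm1.xxxx.xxx; old status = Up; event
= ShutdownRequested; new status = Disconnected; old update count = 71; new
update count = 72]
"""
for update in status_updates(sample):
    print(update["id"], update["old"], "->", update["new"], update["event"])
```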


Thanks,
Alex

On Sat, Jun 23, 2012 at 12:22 AM, Alexey Zilber <alexeyzilber@gmail.com> wrote:

> Hi,
>
>   The saga continues!  I added a KVM host.  The agent decided it wants to
> constantly reboot the server:
>
> 2012-06-23 00:11:32,083{GMT} INFO  [cloud.agent.Agent] (Agent-Handler-2:)
> Startup Response Received: agent id = 5
> 2012-06-23 00:11:32,083 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> Startup Response Received: agent id = 5
> 2012-06-23 00:12:30,187{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> 2012-06-23 00:12:30,187 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 0
> 2012-06-23 00:12:30,209{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> 2012-06-23 00:12:30,209 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 1
> 2012-06-23 00:12:30,232{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> 2012-06-23 00:12:30,232 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 2
> 2012-06-23 00:12:30,254{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> 2012-06-23 00:12:30,254 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 3
> 2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> 2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17, retry: 4
> 2012-06-23 00:12:30,275{GMT} WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the
> host
> 2012-06-23 00:12:30,275 WARN  [resource.computing.KVMHAMonitor]
> (Thread-7:null) write heartbeat failed: Failed to create
> /mnt/9c2be815-de2b-3c14-84bb-54025d782794/KVMHA//hb-10.1.1.17; reboot the
> host
>
> Broadcast message from root@kvm1.xxxxx.xxxx
>         (unknown) at 0:12 ...
>
> The system is going down for reboot NOW!
>
> It looks like the agent was, in fact, at least able to create the initial
> directory:
>
> [root@kvm1 ~]# ls -al /mnt/9c2be815-de2b-3c14-84bb-54025d782794
> total 8
> drwxrwxrwx  2 root root 4096 Jun 22 23:58 .
> drwxr-xr-x. 4 root root 4096 Jun 22 23:58 ..
>
> Here's the agent properties file:
>
> #Storage
> #Sat Jun 23 00:11:32 MYT 2012
> guest.network.device=cloudbr0
> workers=5
> private.network.device=cloudbr0
> port=8250
> resource=com.cloud.agent.resource.computing.LibvirtComputingResource
> pod=1
> zone=1
> guid=0f0f4f5c-99d0-3813-a7a6-00248cdfd17e
> cluster=2
> public.network.device=cloudbr0
> local.storage.uuid=fbefb2ea-f3e0-4f02-96cb-1b8abb6e8c54
> host=10.1.1.18
> LibvirtComputingResource.id=5
>
>
> This is the first time I'm seeing this error...  Last time my KVM setup
> went well, but then KVM was my first hypervisor; now it's the second.
>
> Thanks!
> Alex
>
>
