cloudstack-users mailing list archives

From Caleb Call <>
Subject Xenserver Host unable to reconnect
Date Mon, 11 Feb 2013 22:17:50 GMT
We have a zone with a single host in it.  We also recently upgraded from 3.0.2 to 4.0 (this
may not be relevant, but I figured I'd mention it anyway).  We put the host in maintenance mode
(all VMs were shut down, etc.) and applied some pending patches.  After coming back up, the
host is unable to reconnect.  When I try to force a reconnect, I get the following in the
management log:

2013-02-11 15:04:34,541 DEBUG [] (catalina-exec-19:null) UserDaoCache:
UserDaoMemoryStore hit for 10
2013-02-11 15:04:34,578 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-19:null) submit
async job-4806, details: AsyncJobVO {id:4806, userId: 10, accountId: 7, sessionKey: null,
instanceType: Host, instanceId: 25, cmd:, cmdOriginator:
null, cmdInfo: {"id":"6bc87ba4-52d4-4477-a417-46886d03698d","response":"json","sessionkey":"6IzB5H0fVA9f9FgdWtbNG9GdB5E\u003d","ctxUserId":"10","_":"1360620274351","ctxAccountId":"7","ctxStartEventId":"15461"},
cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode:
0, result: null, initMsid: 145320940120008, completeMsid: null, lastUpdated: null, lastPolled:
null, created: null}
2013-02-11 15:04:34,579 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-3:job-4806)
Executing for job-4806
2013-02-11 15:04:34,587 INFO  [agent.manager.AgentManagerImpl] (Job-Executor-3:job-4806) Unable
to disconnect host because it is not connected to this server: 25
2013-02-11 15:04:34,587 WARN  [api.commands.ReconnectHostCmd] (Job-Executor-3:job-4806) Exception:
        at java.util.concurrent.Executors$
        at java.util.concurrent.FutureTask$Sync.innerRun(
        at java.util.concurrent.ThreadPoolExecutor.runWorker(
        at java.util.concurrent.ThreadPoolExecutor$
2013-02-11 15:04:34,587 WARN  [cloud.api.ApiDispatcher] (Job-Executor-3:job-4806) class
: null
2013-02-11 15:04:34,588 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-3:job-4806)
Complete async job-4806, jobStatus: 2, resultCode: 530, result: Error Code: 534 Error text:
2013-02-11 15:04:39,624 DEBUG [] (catalina-exec-17:null) UserDaoCache:
UserDaoMemoryStore hit for 10
2013-02-11 15:04:39,635 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-17:null) Async
job-4806 completed

Besides the forced reconnect, I can't find anywhere in the logs where it tries to reconnect on
its own.  I do see where it acknowledges the Alert state for the host, but it doesn't give any
reason why.

The only indication I can see that it's even trying is these lines:

2013-02-11 11:47:05,670 DEBUG [xen.resource.XenServerConnectionPool] (ClusteredAgentManager
Timer:null) Failed to slave local login to
2013-02-11 11:47:05,671 WARN  [cloud.resource.DiscovererBase] (ClusteredAgentManager Timer:null)
Unable to configure resource due to Can not create slave connection to

The host these lines refer to is the one that should be reconnecting but is not.
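
One way to look for further reconnect attempts is to filter the management log for the components that appear in the excerpts above. The following is only a rough sketch: the logger names are copied from those excerpts, the timestamp layout is assumed to match them, and the log file path is passed on the command line because its location is not stated here. The function name `filter_entries` is made up for illustration.

```python
import re
import sys

# Logger names copied from the log excerpts in this message; extend as needed.
COMPONENTS = (
    "xen.resource.XenServerConnectionPool",
    "cloud.resource.DiscovererBase",
    "agent.manager.AgentManagerImpl",
)

# Start of a log entry, assuming the layout seen in the excerpts:
# "2013-02-11 15:04:34,587 WARN  [agent.manager.AgentManagerImpl] (...) message"
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]*)\]\s+(?P<rest>.*)$"
)


def filter_entries(lines, components=COMPONENTS):
    """Return (timestamp, level, logger, message) tuples for the given loggers.

    Lines that do not start a new entry (wrapped text, stack frames) are
    appended to the message of the matching entry they follow.
    """
    results = []
    current = None
    for line in lines:
        line = line.rstrip("\n")
        match = ENTRY.match(line)
        if match:
            if match.group("logger") in components:
                current = [match.group("ts"), match.group("level"),
                           match.group("logger"), match.group("rest")]
                results.append(current)
            else:
                # Entry from a logger we are not interested in.
                current = None
        elif current is not None and line.strip():
            current[3] += " " + line.strip()
    return [tuple(entry) for entry in results]


if __name__ == "__main__":
    # Usage: python filter_log.py <path-to-management-log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for ts, level, logger, message in filter_entries(handle):
            print(ts, level, logger, message)
```

Sorting or grouping the output by timestamp should show whether the connection-pool failure repeats on a timer or only appeared once.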

Is there anything else I can look at to find out why it's not connecting?  Any suggestions on
why my host won't reconnect?



