cloudstack-users mailing list archives

From Ahmad Emneina <aemne...@gmail.com>
Subject Re: Xenserver Host unable to reconnect
Date Mon, 11 Feb 2013 22:23:53 GMT
From the management server, can you SSH to that host? Can you execute xe
commands on that host? If yes to both, null out the mgmt_server_id for
your host in the host table, then issue the force reconnect. See if
that helps.
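
To make the database step concrete, here is a minimal sketch of the update. The table name (host) and column name (mgmt_server_id) come from the advice above, and host id 25 comes from the log quoted below. Everything else is an assumption for illustration: an in-memory SQLite database stands in for the real MySQL database so the sketch runs on its own, the toy schema has only three columns, the host names are made up, and the stored management server value is simply copied from initMsid in the log as a plausible placeholder.

```python
import sqlite3


def clear_mgmt_server_id(conn, host_id):
    """Set mgmt_server_id to NULL for a single host row.

    Returns the number of rows changed so the caller can confirm
    that exactly one host was touched.
    """
    cur = conn.execute(
        "UPDATE host SET mgmt_server_id = NULL WHERE id = ?",
        (host_id,),
    )
    conn.commit()
    return cur.rowcount


# Toy stand-in for the real host table (hypothetical, heavily reduced schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT, mgmt_server_id INTEGER)"
)
# Host 25 is the id from the log; the names and the second row are made up.
conn.execute("INSERT INTO host VALUES (25, 'xen-host', 145320940120008)")
conn.execute("INSERT INTO host VALUES (26, 'other-host', 145320940120008)")

changed = clear_mgmt_server_id(conn, 25)
rows = conn.execute("SELECT id, mgmt_server_id FROM host ORDER BY id").fetchall()
print(changed, rows)
```

On a real deployment the same UPDATE statement would presumably be run against the management server's MySQL database, ideally after taking a backup and checking with a SELECT that the WHERE clause matches only the intended host.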


On Mon, Feb 11, 2013 at 2:17 PM, Caleb Call <ccall@overstock.com> wrote:

> We have a zone that has a single host in it.  We also recently upgraded
> from 3.0.2 to 4.0 (this may not be relevant, but I figured I'd mention it
> anyway).  We put our host in maintenance mode (all VMs were shut down, etc.)
> and applied some patches that were waiting to be applied.  After coming
> back up, it is now unable to reconnect. When I try to force a reconnect, I
> get the following in the management log:
>
> 2013-02-11 15:04:34,541 DEBUG [ehcache.store.MemoryStore]
> (catalina-exec-19:null) UserDaoCache: UserDaoMemoryStore hit for 10
> 2013-02-11 15:04:34,578 DEBUG [cloud.async.AsyncJobManagerImpl]
> (catalina-exec-19:null) submit async job-4806, details: AsyncJobVO
> {id:4806, userId: 10, accountId: 7, sessionKey: null, instanceType: Host,
> instanceId: 25, cmd: com.cloud.api.commands.ReconnectHostCmd,
> cmdOriginator: null, cmdInfo:
> {"id":"6bc87ba4-52d4-4477-a417-46886d03698d","response":"json","sessionkey":"6IzB5H0fVA9f9FgdWtbNG9GdB5E\u003d","ctxUserId":"10","_":"1360620274351","ctxAccountId":"7","ctxStartEventId":"15461"},
> cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0,
> processStatus: 0, resultCode: 0, result: null, initMsid: 145320940120008,
> completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
> 2013-02-11 15:04:34,579 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-3:job-4806) Executing com.cloud.api.commands.ReconnectHostCmd
> for job-4806
> 2013-02-11 15:04:34,587 INFO  [agent.manager.AgentManagerImpl]
> (Job-Executor-3:job-4806) Unable to disconnect host because it is not
> connected to this server: 25
> 2013-02-11 15:04:34,587 WARN  [api.commands.ReconnectHostCmd]
> (Job-Executor-3:job-4806) Exception:
> com.cloud.api.ServerApiException
>         at
> com.cloud.api.commands.ReconnectHostCmd.execute(ReconnectHostCmd.java:108)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
>         at
> com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:432)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-02-11 15:04:34,587 WARN  [cloud.api.ApiDispatcher]
> (Job-Executor-3:job-4806) class com.cloud.api.ServerApiException : null
> 2013-02-11 15:04:34,588 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-3:job-4806) Complete async job-4806, jobStatus: 2,
> resultCode: 530, result: Error Code: 534 Error text: null
> 2013-02-11 15:04:39,624 DEBUG [ehcache.store.MemoryStore]
> (catalina-exec-17:null) UserDaoCache: UserDaoMemoryStore hit for 10
> 2013-02-11 15:04:39,635 DEBUG [cloud.async.AsyncJobManagerImpl]
> (catalina-exec-17:null) Async job-4806 completed
>
>
> I can't find anywhere in the logs where it tries to reconnect on its own
> (besides the force reconnect).  I do see where it acknowledges the Alert
> state for the host, but it doesn't give any reason why.
>
> The only indication I can see that it's even trying is this pair of lines:
>
> 2013-02-11 11:47:05,670 DEBUG [xen.resource.XenServerConnectionPool]
> (ClusteredAgentManager Timer:null) Failed to slave local login to 10.5.1.14
> 2013-02-11 11:47:05,671 WARN  [cloud.resource.DiscovererBase]
> (ClusteredAgentManager Timer:null) Unable to configure resource due to Can
> not create slave connection to 10.5.1.14
>
> 10.5.1.14 is the host that should be reconnecting but is not.
>
> Is there anything else I can look at to find out why it's not connecting?
> Any suggestions on why my host won't reconnect?
>
> Thanks
