cloudstack-users mailing list archives

From "Mice Xia" <mice_...@tcloudcomputing.com>
Subject Re: Xen Host failure in pool
Date Fri, 10 Aug 2012 15:44:23 GMT

Did you see any sign that the NICs are lost or cannot be found?
If so, try xe pool-emergency-reset-master and then reboot.

I'm not in the office, so I'm not sure this is the correct command. I remember there are several similar ones,
but only one of them works.
Please google "xenserver nics disappear" or "nics lost" if it does not work.

Hope this helps

Regards
Mice
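
[Editor's sketch] Before repointing a slave it can help to confirm which master it currently believes in. A XenServer host normally records its pool role in /etc/xensource/pool.conf, as either "master" or "slave:<address>". That path and format are general XenServer behaviour rather than something stated in this thread, so treat them as assumptions, and the addresses used below are hypothetical. A minimal Python sketch that reads the contents of that file and reports the role:

```python
from typing import Optional, Tuple


def parse_pool_conf(text: str) -> Tuple[str, Optional[str]]:
    """Return (role, master_address) from the contents of pool.conf.

    Assumed format: the single word 'master', or 'slave:<address>'.
    """
    value = text.strip()
    if value == "master":
        return ("master", None)
    if value.startswith("slave:"):
        return ("slave", value[len("slave:"):].strip())
    raise ValueError(f"unrecognised pool.conf contents: {value!r}")


def points_at(text: str, expected_master: str) -> bool:
    """True if this host is a slave whose recorded master is expected_master."""
    role, address = parse_pool_conf(text)
    return role == "slave" and address == expected_master


if __name__ == "__main__":
    # Hypothetical contents copied from a slave's pool.conf.
    sample = "slave:172.16.5.10\n"
    print(parse_pool_conf(sample))
    # Hypothetical address of the host that is the pool master now.
    print(points_at(sample, "172.16.5.11"))
```

If the recorded address is not the current master, that would be consistent with the "wrong master" message in the quoted log. The command named above is usually given the new master's address, in the form xe pool-emergency-reset-master master-address=<address>, but check the XenServer documentation for your version before running it.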


-----Original Message-----
From: Nik Martin [mailto:nik.martin@nfinausa.com]
Sent: 2012-8-10 (Friday) 23:35
To: cloudstack-users@incubator.apache.org
Subject: Re: Xen Host failure in pool
 
On 08/10/2012 10:32 AM, Mice Xia wrote:
> 
> I remember that when a network partition happens, a pool slave may enter emergency mode and show
> as offline because it could not reach its master for a long time.
> Could you check hv1's console (the graphical console, not the ssh console) and see whether its
> NICs are shown correctly?
> 
> Regards
> Mice
> 
No, when I went into the xsconsole and tried to review all the settings,
it was not showing the management interfaces properly.

-- 
Regards,

Nik

> -----Original Message-----
> From: Nik Martin [mailto:nik.martin@nfinausa.com]
> Sent: 2012-8-10 (Friday) 23:04
> To: cloudstack-users@incubator.apache.org
> Subject: Xen Host failure in pool
>   
> We have a Xenserver 6.2 based pool of three hosts running under
> CloudStack Acton release (code base is about two weeks old).  We left
> last night and everything was fine, with about 2 VMs running on
> each host, not doing anything. This morning I came in and three VMs
> had stopped. I logged into XenCenter to see what the pool looked
> like: the pool master had changed from host HV3 to HV2, and HV1 was
> offline.  I logged in to HV1's console and looked at
> /var/log/messages, which was complaining about the pool master address
> being wrong. I went into the CloudStack UI and deleted and re-added the
> host. The add failed immediately, and I got this in the log when I did:
> 
> 
> 2012-08-10 09:56:39,566 DEBUG [cloud.api.ApiServlet]
> (catalina-exec-24:null) Invalid paramemter in URL found. param: hosttags=
> 2012-08-10 09:56:39,573 INFO  [cloud.resource.ResourceManagerImpl]
> (catalina-exec-24:null) Trying to add a new host at http://172.16.5.3 in
> data center 2
> 2012-08-10 09:56:39,629 DEBUG [xen.resource.XenServerConnectionPool]
> (catalina-exec-24:null) Slave logon to 172.16.5.3
> 2012-08-10 09:56:39,632 DEBUG [xen.resource.XenServerConnectionPool]
> (catalina-exec-24:null) Failed to slave local login to 172.16.5.3 due to
> The master says the host is not known to it. Perhaps the Host was
> deleted from the master's database? Perhaps the slave is pointing to the
> wrong master?
> 2012-08-10 09:56:39,638 DEBUG [xen.discoverer.XcpServerDiscoverer]
> (catalina-exec-24:null) other exceptions: java.lang.RuntimeException:
> can not get master ip
> java.lang.RuntimeException: can not get master ip
> 	at
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.getMasterIp(XenServerConnectionPool.java:343)
> 	at
> com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer.find(XcpServerDiscoverer.java:179)
> 	at
> com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:644)
> 	at
> com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:514)
> 	at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
> 	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
> 	at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
> 	at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
> 	at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
> 	at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> 	at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> 	at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> 	at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> 	at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> 	at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
> 	at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> 	at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> 	at
> org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
> 	at
> org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
> 	at
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
> 	at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:679)
> 2012-08-10 09:56:39,638 WARN  [cloud.resource.ResourceManagerImpl]
> (catalina-exec-24:null) Unable to find the server resources at
> http://172.16.5.3
> 2012-08-10 09:56:39,642 WARN  [api.commands.AddHostCmd]
> (catalina-exec-24:null) Exception:
> com.cloud.exception.DiscoveryException: Unable to add the host
> 	at
> com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:694)
> 	at
> com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:514)
> 	at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
> 	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
> 	at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
> 	at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
> 	at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
> 	at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> 	at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> 	at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> 	at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> 	at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> 	at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> 	at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
> 	at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> 	at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> 	at
> org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
> 	at
> org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
> 	at
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
> 	at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:679)
> 2012-08-10 09:56:39,642 WARN  [cloud.api.ApiDispatcher]
> (catalina-exec-24:null) class com.cloud.api.ServerApiException : Unable
> to add the host
> 2012-08-10 09:56:39,723 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-305:null) Ping from 17
> 2012-08-10 09:56:43,822 DEBUG
> [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone
> 2 is ready to launch secondary storage VM
> 2012-08-10 09:56:43,916 DEBUG
> [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone
> 2 is ready to launch console proxy
> 2012-08-10 09:56:44,102 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 2 routers.
> 2012-08-10 09:56:44,614 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-12:null) Ping from 22
> 2012-08-10 09:56:48,864 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-10:null) Ping from 18
> 2012-08-10 09:56:49,511 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-1:null) VmStatsCollector is running...
> 2012-08-10 09:56:49,525 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-305:null) Seq 16-92408948: Executing request
> 2012-08-10 09:56:49,763 DEBUG [xen.resource.CitrixResourceBase]
> (DirectAgent-305:null) Vm cpu utilization 0.01
> 2012-08-10 09:56:49,763 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-305:null) Seq 16-92408948: Response Received:
> 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (DirectAgent-305:null) Cleanup succeeded. Details null
> 2012-08-10 09:56:49,763 DEBUG [agent.transport.Request]
> (StatsCollector-1:null) Seq 16-92408948: Received:  { Ans: , MgmtId:
> 130577622632, via: 16, Ver: v1, Flags: 10, { GetVmStatsAnswer } }
> 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (StatsCollector-1:null) Cleanup succeeded. Details null
> 2012-08-10 09:56:54,411 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-497:null) Ping from 17
> 2012-08-10 09:56:54,550 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-338:null) Ping from 16
> 2012-08-10 09:56:59,614 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-8:null) Ping from 22
> 2012-08-10 09:57:03,864 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-9:null) Ping from 18
> 2012-08-10 09:57:09,551 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-71:null) Ping from 16
> 2012-08-10 09:57:09,669 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-338:null) Ping from 17
> 2012-08-10 09:57:13,821 DEBUG
> [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone
> 2 is ready to launch secondary storage VM
> 2012-08-10 09:57:13,918 DEBUG
> [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone
> 2 is ready to launch console proxy
> 2012-08-10 09:57:14,102 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 2 routers.
> 2012-08-10 09:57:14,614 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-11:null) Ping from 22
> 2012-08-10 09:57:15,645 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) HostStatsCollector is running...
> 2012-08-10 09:57:15,656 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-71:null) Seq 16-92408949: Executing request
> 2012-08-10 09:57:15,878 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-71:null) Seq 16-92408949: Response Received:
> 2012-08-10 09:57:15,878 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (DirectAgent-71:null) Cleanup succeeded. Details null
> 2012-08-10 09:57:15,878 DEBUG [agent.transport.Request]
> (StatsCollector-3:null) Seq 16-92408949: Received:  { Ans: , MgmtId:
> 130577622632, via: 16, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> 2012-08-10 09:57:15,879 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (StatsCollector-3:null) Cleanup succeeded. Details null
> 2012-08-10 09:57:15,884 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-338:null) Seq 17-665190891: Executing request
> 2012-08-10 09:57:16,312 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-338:null) Seq 17-665190891: Response Received:
> 2012-08-10 09:57:16,312 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (DirectAgent-338:null) Cleanup succeeded. Details null
> 2012-08-10 09:57:16,312 DEBUG [agent.transport.Request]
> (StatsCollector-3:null) Seq 17-665190891: Received:  { Ans: , MgmtId:
> 130577622632, via: 17, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> 2012-08-10 09:57:16,313 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (StatsCollector-3:null) Cleanup succeeded. Details null
> 2012-08-10 09:57:18,864 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-15:null) Ping from 18
> 2012-08-10 09:57:24,407 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-71:null) Ping from 17
> 2012-08-10 09:57:24,566 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-338:null) Ping from 16
> 2012-08-10 09:57:29,615 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-1:null) Ping from 22
> 2012-08-10 09:57:30,047 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-294:null) Seq 16-92405762: Executing request
> 2012-08-10 09:57:30,308 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-294:null) Seq 16-92405762: Response Received:
> 2012-08-10 09:57:30,308 DEBUG [agent.transport.Request]
> (DirectAgent-294:null) Seq 16-92405762: Processing:  { Ans: , MgmtId:
> 130577622632, via: 16, Ver: v1, Flags: 10,
> [{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
> }
> 2012-08-10 09:57:31,060 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-357:null) Seq 17-665190402: Executing request
> 2012-08-10 09:57:31,250 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-357:null) Seq 17-665190402: Response Received:
> 2012-08-10 09:57:31,250 DEBUG [agent.transport.Request]
> (DirectAgent-357:null) Seq 17-665190402: Processing:  { Ans: , MgmtId:
> 130577622632, via: 17, Ver: v1, Flags: 10,
> [{"Answer":{"result":true,"wait":0}}] }
> 
> This is a very serious error, and I don't know how to fix it.  Can
> anyone suggest what might be the problem and how I might fix it?
> 
> 
