cloudstack-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anthony Xu <Xuefei...@citrix.com>
Subject RE: Xen Host failure in pool
Date Fri, 10 Aug 2012 17:59:01 GMT
Hi Nik

What's the network configuration in XenServer host?
Is it bridge or openvswitch?

You can get the info by
Cat /etc/xensource/network.conf

Anthony

> -----Original Message-----
> From: Nik Martin [mailto:nik.martin@nfinausa.com]
> Sent: Friday, August 10, 2012 8:36 AM
> To: cloudstack-users@incubator.apache.org
> Subject: Re: Xen Host failure in pool
> 
> On 08/10/2012 10:32 AM, Mice Xia wrote:
> >
> > I remember when network partition happens, pool slave may enter
> emergency mode and show offline as it could not reach its master for a
> long time.
> > Could you check hv1's console (graphical console, not ssh console),
> and check if its nics are shown correctly?
> >
> > Regards
> > Mice
> >
> No, when I went into the xsconsole and tried to review all the settings,
> it was not showing the management interfaces properly.
> 
> --
> Regards,
> 
> Nik
> 
> > -----Original Message-----
> > From: Nik Martin [mailto:nik.martin@nfinausa.com]
> > Sent: 2012-8-10 (ζ˜ŸζœŸδΊ”) 23:04
> > To: cloudstack-users@incubator.apache.org
> > Subject: Xen Host failure in pool
> >
> > We have a Xenserver 6.2 based pool of three hosts running under
> > CloudStack Acton release (code base is about two weeks old).  We left
> > last night and everything was fine, and I have about 2 VMs running on
> > each host, not doing anything. This morning, I came in, and three VMs
> > have stopped, and I logged into XenCenter to see what the pool looked
> > like, and the Pol master hd changed from host HV3 to HV2, and HV1 was
> > offline.  I logged in to HV1's console, and looked at the
> > /var/log/messages, and it was complaining about the pool master
> address
> > being wrong. I went into CloudStack UI and deleted and re-added the
> > host, and it failed immediately, and I got this in the log when I did:
> >
> >
> > 2012-08-10 09:56:39,566 DEBUG [cloud.api.ApiServlet]
> > (catalina-exec-24:null) Invalid paramemter in URL found. param:
> hosttags=
> > 2012-08-10 09:56:39,573 INFO  [cloud.resource.ResourceManagerImpl]
> > (catalina-exec-24:null) Trying to add a new host at http://172.16.5.3
> in
> > data center 2
> > 2012-08-10 09:56:39,629 DEBUG [xen.resource.XenServerConnectionPool]
> > (catalina-exec-24:null) Slave logon to 172.16.5.3
> > 2012-08-10 09:56:39,632 DEBUG [xen.resource.XenServerConnectionPool]
> > (catalina-exec-24:null) Failed to slave local login to 172.16.5.3 due
> to
> > The master says the host is not known to it. Perhaps the Host was
> > deleted from the master's database? Perhaps the slave is pointing to
> the
> > wrong master?
> > 2012-08-10 09:56:39,638 DEBUG [xen.discoverer.XcpServerDiscoverer]
> > (catalina-exec-24:null) other exceptions: java.lang.RuntimeException:
> > can not get master ip
> > java.lang.RuntimeException: can not get master ip
> > 	at
> >
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.getMasterIp(X
> enServerConnectionPool.java:343)
> > 	at
> >
> com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer.find(XcpServerD
> iscoverer.java:179)
> > 	at
> >
> com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManage
> rImpl.java:644)
> > 	at
> >
> com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImp
> l.java:514)
> > 	at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
> > 	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
> > 	at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
> > 	at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
> > 	at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
> > 	at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
> > 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > 	at
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Applic
> ationFilterChain.java:290)
> > 	at
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFil
> terChain.java:206)
> > 	at
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperVal
> ve.java:233)
> > 	at
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextVal
> ve.java:191)
> > 	at
> >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.jav
> a:127)
> > 	at
> >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.jav
> a:102)
> > 	at
> >
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:55
> 5)
> > 	at
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve
> .java:109)
> > 	at
> >
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:
> 298)
> > 	at
> >
> org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.
> java:889)
> > 	at
> >
> org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.proc
> ess(Http11NioProtocol.java:721)
> > 	at
> >
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.
> java:2268)
> > 	at
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.ja
> va:1110)
> > 	at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.j
> ava:603)
> > 	at java.lang.Thread.run(Thread.java:679)
> > 2012-08-10 09:56:39,638 WARN  [cloud.resource.ResourceManagerImpl]
> > (catalina-exec-24:null) Unable to find the server resources at
> > http://172.16.5.3
> > 2012-08-10 09:56:39,642 WARN  [api.commands.AddHostCmd]
> > (catalina-exec-24:null) Exception:
> > com.cloud.exception.DiscoveryException: Unable to add the host
> > 	at
> >
> com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManage
> rImpl.java:694)
> > 	at
> >
> com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImp
> l.java:514)
> > 	at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
> > 	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
> > 	at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
> > 	at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
> > 	at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
> > 	at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
> > 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> > 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> > 	at
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Applic
> ationFilterChain.java:290)
> > 	at
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFil
> terChain.java:206)
> > 	at
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperVal
> ve.java:233)
> > 	at
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextVal
> ve.java:191)
> > 	at
> >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.jav
> a:127)
> > 	at
> >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.jav
> a:102)
> > 	at
> >
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:55
> 5)
> > 	at
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve
> .java:109)
> > 	at
> >
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:
> 298)
> > 	at
> >
> org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.
> java:889)
> > 	at
> >
> org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.proc
> ess(Http11NioProtocol.java:721)
> > 	at
> >
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.
> java:2268)
> > 	at
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.ja
> va:1110)
> > 	at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.j
> ava:603)
> > 	at java.lang.Thread.run(Thread.java:679)
> > 2012-08-10 09:56:39,642 WARN  [cloud.api.ApiDispatcher]
> > (catalina-exec-24:null) class com.cloud.api.ServerApiException :
> Unable
> > to add the host
> > 2012-08-10 09:56:39,723 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-305:null) Ping from 17
> > 2012-08-10 09:56:43,822 DEBUG
> > [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null)
> Zone
> > 2 is ready to launch secondary storage VM
> > 2012-08-10 09:56:43,916 DEBUG
> > [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null)
> Zone
> > 2 is ready to launch console proxy
> > 2012-08-10 09:56:44,102 DEBUG
> > [network.router.VirtualNetworkApplianceManagerImpl]
> > (RouterStatusMonitor-1:null) Found 2 routers.
> > 2012-08-10 09:56:44,614 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-12:null) Ping from 22
> > 2012-08-10 09:56:48,864 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-10:null) Ping from 18
> > 2012-08-10 09:56:49,511 DEBUG [cloud.server.StatsCollector]
> > (StatsCollector-1:null) VmStatsCollector is running...
> > 2012-08-10 09:56:49,525 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-305:null) Seq 16-92408948: Executing request
> > 2012-08-10 09:56:49,763 DEBUG [xen.resource.CitrixResourceBase]
> > (DirectAgent-305:null) Vm cpu utilization 0.01
> > 2012-08-10 09:56:49,763 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-305:null) Seq 16-92408948: Response Received:
> > 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (DirectAgent-305:null) Cleanup succeeded. Details null
> > 2012-08-10 09:56:49,763 DEBUG [agent.transport.Request]
> > (StatsCollector-1:null) Seq 16-92408948: Received:  { Ans: , MgmtId:
> > 130577622632, via: 16, Ver: v1, Flags: 10, { GetVmStatsAnswer } }
> > 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (StatsCollector-1:null) Cleanup succeeded. Details null
> > 2012-08-10 09:56:54,411 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-497:null) Ping from 17
> > 2012-08-10 09:56:54,550 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-338:null) Ping from 16
> > 2012-08-10 09:56:59,614 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-8:null) Ping from 22
> > 2012-08-10 09:57:03,864 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-9:null) Ping from 18
> > 2012-08-10 09:57:09,551 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-71:null) Ping from 16
> > 2012-08-10 09:57:09,669 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-338:null) Ping from 17
> > 2012-08-10 09:57:13,821 DEBUG
> > [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null)
> Zone
> > 2 is ready to launch secondary storage VM
> > 2012-08-10 09:57:13,918 DEBUG
> > [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null)
> Zone
> > 2 is ready to launch console proxy
> > 2012-08-10 09:57:14,102 DEBUG
> > [network.router.VirtualNetworkApplianceManagerImpl]
> > (RouterStatusMonitor-1:null) Found 2 routers.
> > 2012-08-10 09:57:14,614 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-11:null) Ping from 22
> > 2012-08-10 09:57:15,645 DEBUG [cloud.server.StatsCollector]
> > (StatsCollector-3:null) HostStatsCollector is running...
> > 2012-08-10 09:57:15,656 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-71:null) Seq 16-92408949: Executing request
> > 2012-08-10 09:57:15,878 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-71:null) Seq 16-92408949: Response Received:
> > 2012-08-10 09:57:15,878 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (DirectAgent-71:null) Cleanup succeeded. Details null
> > 2012-08-10 09:57:15,878 DEBUG [agent.transport.Request]
> > (StatsCollector-3:null) Seq 16-92408949: Received:  { Ans: , MgmtId:
> > 130577622632, via: 16, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> > 2012-08-10 09:57:15,879 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (StatsCollector-3:null) Cleanup succeeded. Details null
> > 2012-08-10 09:57:15,884 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-338:null) Seq 17-665190891: Executing request
> > 2012-08-10 09:57:16,312 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-338:null) Seq 17-665190891: Response Received:
> > 2012-08-10 09:57:16,312 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (DirectAgent-338:null) Cleanup succeeded. Details null
> > 2012-08-10 09:57:16,312 DEBUG [agent.transport.Request]
> > (StatsCollector-3:null) Seq 17-665190891: Received:  { Ans: , MgmtId:
> > 130577622632, via: 17, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> > 2012-08-10 09:57:16,313 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> > (StatsCollector-3:null) Cleanup succeeded. Details null
> > 2012-08-10 09:57:18,864 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-15:null) Ping from 18
> > 2012-08-10 09:57:24,407 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-71:null) Ping from 17
> > 2012-08-10 09:57:24,566 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-338:null) Ping from 16
> > 2012-08-10 09:57:29,615 DEBUG [agent.manager.AgentManagerImpl]
> > (AgentManager-Handler-1:null) Ping from 22
> > 2012-08-10 09:57:30,047 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-294:null) Seq 16-92405762: Executing request
> > 2012-08-10 09:57:30,308 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-294:null) Seq 16-92405762: Response Received:
> > 2012-08-10 09:57:30,308 DEBUG [agent.transport.Request]
> > (DirectAgent-294:null) Seq 16-92405762: Processing:  { Ans: , MgmtId:
> > 130577622632, via: 16, Ver: v1, Flags: 10,
> >
> [{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":fal
> se,"result":true,"wait":0}}]
> > }
> > 2012-08-10 09:57:31,060 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-357:null) Seq 17-665190402: Executing request
> > 2012-08-10 09:57:31,250 DEBUG [agent.manager.DirectAgentAttache]
> > (DirectAgent-357:null) Seq 17-665190402: Response Received:
> > 2012-08-10 09:57:31,250 DEBUG [agent.transport.Request]
> > (DirectAgent-357:null) Seq 17-665190402: Processing:  { Ans: , MgmtId:
> > 130577622632, via: 17, Ver: v1, Flags: 10,
> > [{"Answer":{"result":true,"wait":0}}] }
> >
> > This is a very serious error, and I don't know how to fix it.  Can
> > anyone suggest what might be the problem and hos I might fix it?
> >
> >
> 
> 

Mime
View raw message