cloudstack-users mailing list archives

From "Leeno Jose.P.A" <leeno...@gmail.com>
Subject Re: Rebuilding management server
Date Wed, 17 Jul 2013 05:29:40 GMT
Hi Ahmad,

Thanks for the mail.
Telnet from the XS hosts to CS on port 443 failed. I did not see any service
listening on port 443 on the CS management server. Is something wrong with
the management server?
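
A minimal sketch of the same connectivity check as a script, assuming Python 3 is available on the machine running the test. The helper names `port_open` and `check_ports` are illustrative and not part of CloudStack or XenServer; the host and ports are whatever is being tested (22 and 443 are the ports mentioned in this thread).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port in `ports` to True (connect succeeded) or False."""
    return {port: port_open(host, port, timeout) for port in ports}
```

For example, `check_ports("192.168.10.251", [22, 443])` run from an XS host returns a dict mapping each port to True or False; 192.168.10.251 is the management server service IP shown in the startup log quoted below. A False result only says that the TCP connection did not succeed, not whether the cause is a firewall rule, a routing problem, or no service listening on that port.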

Thanks
Leeno


On Tue, Jul 16, 2013 at 11:50 PM, Ahmad Emneina <aemneina@gmail.com> wrote:

> can you check if your hosts can connect back to the management server
> (ping, telnet 22,443)? there might be some firewall rules in place, or
> routing issues, preventing this.
>
>
> On Tue, Jul 16, 2013 at 9:01 AM, Leeno Jose.P.A <leenojos@gmail.com>
> wrote:
>
> > CS startup logs,
> >
> > 2013-07-16 11:25:30,702 INFO  [utils.component.ComponentContext] (Timer-1:null) Starting com.cloud.network.guru.NiciraNvpGuestNetworkGuru_EnhancerByCloudStack_1f6b4bb6
> > 2013-07-16 11:25:30,702 INFO  [utils.component.ComponentContext] (Timer-1:null) Starting com.cloud.server.ManagementServerImpl_EnhancerByCloudStack_d54e1bb1
> > 2013-07-16 11:25:30,702 INFO  [cloud.server.ManagementServerImpl] (Timer-1:null) Startup CloudStack management server...
> > 2013-07-16 11:25:30,707 INFO  [cloud.cluster.ClusterServiceServletContainer] (Thread-18:null) Cluster service servlet container listening on port 9090
> > 2013-07-16 11:25:31,832 DEBUG [utils.db.ConnectionConcierge] (Cluster-Heartbeat-1:null) Registering a database connection for ClusterManagerHeartBeat2
> > 2013-07-16 11:25:31,845 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) We are good, no orphan management server msid in host table is found
> > 2013-07-16 11:25:31,845 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Found 1 inactive management server node based on timestamp
> > 2013-07-16 11:25:31,846 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) management server node msid: 130602634328, name: cstagcms, service ip: 192.168.10.251, version: 4.1.0
> > 2013-07-16 11:25:31,846 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.10.251
> > 2013-07-16 11:25:31,860 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node joined, id:2, nodeIP:192.168.10.251
> > 2013-07-16 11:25:33,348 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Notify management server node join to listeners.
> > 2013-07-16 11:25:33,349 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Joining node, IP: 192.168.10.251, msid: 81375086018793
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Receive cluster alert, EventArgs: com.cloud.cluster.ClusterNodeJoinEventArgs
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Handle cluster node join alert, joined node: 192.168.10.251, msidL: 81375086018793
> > 2013-07-16 11:25:33,350 DEBUG [cloud.alert.ClusterAlertAdapter] (Cluster-Notification-1:null) Management server node 192.168.10.251 is up, send alert
> > 2013-07-16 11:25:33,361 WARN  [cloud.cluster.ClusterManagerImpl] (Cluster-Notification-1:null) Notifying management server join event took 12 ms
> > 2013-07-16 11:25:45,450 DEBUG [cloud.server.StatsCollector] (StatsCollector-2:null) HostStatsCollector is running...
> > 2013-07-16 11:25:45,452 DEBUG [cloud.server.StatsCollector] (StatsCollector-1:null) VmStatsCollector is running...
> > 2013-07-16 11:25:45,467 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) StorageCollector is running...
> > 2013-07-16 11:25:45,498 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-2:null) create forwarding ClusteredAgentAttache for 39
> > 2013-07-16 11:25:45,491 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) create forwarding ClusteredAgentAttache for 50
> > 2013-07-16 11:25:45,751 INFO  [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) SSL: Handshake done
> > 2013-07-16 11:25:45,752 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-3:null) Connection to peer opened: 130602634328, ip: 192.168.10.251
> > 2013-07-16 11:25:45,757 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-2:null) Seq 39-282525697: Forwarding null to 130602634328
> > 2013-07-16 11:25:45,758 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-3:null) Seq 50-1962541057: Forwarding null to 130602634328
> > 2013-07-16 11:25:45,804 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 39-282525697: Routing from 81375086018793
> > 2013-07-16 11:25:45,804 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 39-282525697: Link is closed
> > 2013-07-16 11:25:45,806 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-2:null) Seq 39-282525697: MgmtId 81375086018793: Req: Resource [Host:39] is unreachable: Host 39: Link is closed
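
A small sketch for summarising excerpts like the one above, assuming Python 3 and a management-server.log in which each entry sits on a single line (not re-wrapped by a mail client). The function name `unreachable_hosts` is hypothetical; the pattern is taken from the "is unreachable" message in the log excerpt above.

```python
import re
from collections import Counter

# Matches the message format seen above, e.g.
# "Resource [Host:39] is unreachable: Host 39: Link is closed"
UNREACHABLE = re.compile(r"Resource \[Host:(\d+)\] is unreachable")


def unreachable_hosts(lines):
    """Count 'is unreachable' messages per host ID in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing an open log file (for example `unreachable_hosts(open(path))`, where `path` points at the log) gives a per-host count, which shows quickly whether every host is affected or only some of them.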
> >
> >
> > Thanks
> > Leeno
> >
> >
> > On Tue, Jul 16, 2013 at 6:10 PM, Leeno Jose.P.A <leenojos@gmail.com> wrote:
> >
> >> Hi Todd,
> >>
> >> Thanks for the help.
> >>
> >> I executed the steps you mentioned above, but that did not help. I still
> >> get the same error message. However, from CS I can ping the XS hosts and
> >> telnet to ports 22, 80 and 443 on them.
> >>
> >> Thanks
> >> Leeno
> >>
> >>
> >> On Tue, Jul 16, 2013 at 5:12 PM, Todd Pigram <todd@toddpigram.com>
> wrote:
> >>
> >>> Did you remove the Tags on each XenServer host prior to starting?
> >>>
> >>> Management Controller Failure and Replacement
> >>>
> >>> In setting up your cloud, you should have a backup routine for your
> >>> controller. The most important items to back up are the MySQL databases
> >>> that CloudStack uses. A suitable backup script is attached to this page.
> >>> In the event of a cloud management controller failure, the steps to
> >>> replace the controller with a new one are:
> >>>
> >>> These instructions assume your cluster is XenServer. Contributors
> >>> using other hypervisors, please contribute.
> >>>
> >>>    1. Set up the new management server hardware.
> >>>    2. Install your OS.
> >>>    3. Install CloudStack, up to and including the "Install Database
> >>>    step".
> >>>    4. Import your database backup.
> >>>    5. In XenCenter, connect to your CloudStack host pool.
> >>>    6. On each host, remove the tags on Host > General Tab > Tags by
> >>>    editing the tags and un-checking each one.
> >>>    7. On the management controller, start CloudStack:
> >>>       1. service cloud-management start
> >>>    8. The new cloud management controller will connect to each host in
> >>>    the database and push out new tags and keys to each host in the pool.
> >>>
> >>>
> >>> On Jul 16, 2013, at 1:13 AM, Leeno Jose.P.A <leenojos@gmail.com>
> wrote:
> >>>
> >>> After restoring the old database dump to the new installation, CS is
> >>> unable to contact the XenServer hosts. I am getting the following errors
> >>> in management-server.log:
> >>>
> >>>
> >>> 2013-07-15 11:57:49,646 DEBUG [agent.manager.ClusteredAgentManagerImpl] (StatsCollector-1:null) Connection to peer opened: 130602634328, ip: 192.168.10.251
> >>> 2013-07-15 11:57:49,652 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-2:null) Seq 50-185008129: Forwarding null to 130602634328
> >>> 2013-07-15 11:57:49,662 DEBUG [agent.manager.ClusteredAgentAttache] (StatsCollector-1:null) Seq 39-1272840193: Forwarding null to 130602634328
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 50-185008129: Routing from 81375086018793
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-2:null) Seq 50-185008129: Link is closed
> >>> 2013-07-15 11:57:49,699 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-3:null) Seq 39-1272840193: Routing from 81375086018793
> >>> 2013-07-15 11:57:49,700 DEBUG [agent.manager.ClusteredAgentAttache] (AgentManager-Handler-3:null) Seq 39-1272840193: Link is closed
> >>> 2013-07-15 11:57:49,700 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-3:null) Seq 39-1272840193: MgmtId 81375086018793: Req: Resource [Host:39] is unreachable: Host 39: Link is closed
> >>>
> >>>
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.ClusteredAgentManagerImpl] (AgentManager-Handler-8:null) Seq 39--1: MgmtId 81375086018793: Req: Cancel request received
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-8:null) Seq 39-1272840194: Cancelling.
> >>> 2013-07-15 11:57:49,861 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Waiting some more time because this is the current command
> >>> 2013-07-15 11:57:49,862 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Waiting some more time because this is the current command
> >>> 2013-07-15 11:57:49,862 INFO  [utils.exception.CSExceptionErrorCode] (StatsCollector-2:null) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
> >>> 2013-07-15 11:57:49,862 WARN  [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Timed out on null
> >>> 2013-07-15 11:57:49,862 DEBUG [agent.manager.AgentAttache] (StatsCollector-2:null) Seq 39-1272840194: Cancelling.
> >>> 2013-07-15 11:57:49,863 DEBUG [cloud.storage.StorageManagerImpl] (StatsCollector-2:null) Unable to send storage pool command to Pool[210|NetworkFilesystem] via 39
> >>> com.cloud.exception.OperationTimedoutException: Commands 1272840194 to Host 39 timed out after 3600
> >>>        at com.cloud.agent.manager.AgentAttache.send(AgentAttache.java:429)
> >>>        at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:511)
> >>>        at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:464)
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:2347)
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:422)
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:436)
> >>>        at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:316)
> >>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >>>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>        at java.lang.Thread.run(Thread.java:679)
> >>> 2013-07-15 11:57:49,863 INFO  [cloud.server.StatsCollector] (StatsCollector-2:null) Unable to reach Pool[210|NetworkFilesystem]
> >>> com.cloud.exception.StorageUnavailableException: Resource [StoragePool:210] is unreachable: Unable to send command to the pool
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:2357)
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:422)
> >>>        at com.cloud.storage.StorageManagerImpl.sendToPool(StorageManagerImpl.java:436)
> >>>        at com.cloud.server.StatsCollector$StorageCollector.run(StatsCollector.java:316)
> >>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>>        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >>>        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>        at java.lang.Thread.run(Thread.java:679)
> >>>
> >>>
> >>> Thanks
> >>> Leeno
> >>>
> >>>
> >>> On Tue, Jul 16, 2013 at 10:21 AM, Leeno Jose.P.A <leenojos@gmail.com>
> >>> wrote:
> >>>
> >>> This is a dev box. We are planning an HA-enabled environment for the
> >>> prod setup. Thanks, Geoff.
> >>>
> >>>
> >>> On Tue, Jul 16, 2013 at 12:11 AM, Geoff Higginbottom <
> >>> geoff.higginbottom@shapeblue.com> wrote:
> >>>
> >>> Hi Leeno,
> >>>
> >>> In theory that should work, but obviously you will lose all changes made
> >>> since the dump was taken. If any new VMs have been created, they will get
> >>> purged by the system, etc.
> >>>
> >>> I would highly recommend splitting the DB and the Management Server, and
> >>> if possible adding a 2nd instance of each.
> >>>
> >>> Regards
> >>>
> >>> Geoff Higginbottom
> >>>
> >>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >>>
> >>> geoff.higginbottom@shapeblue.com
> >>>
> >>> -----Original Message-----
> >>> From: Leeno Jose.P.A [mailto:leenojos@gmail.com]
> >>> Sent: 15 July 2013 18:46
> >>> To: users@cloudstack.apache.org
> >>> Subject: Re: Rebuilding management server
> >>>
> >>> Hi Geoff,
> >>>
> >>> 1. I have only one management server.
> >>> 2. The management server is not functioning now, but a 'cloud' database
> >>> dump is available in backup. The CS version was 4.1.0 and the hosts were
> >>> XenServer 6.1.0.
> >>> 3. The DB server was on the same machine as the management server.
> >>>
> >>> Now I am planning to do a fresh install of CS 4.1.0 and restore the cloud
> >>> database from the old installation's dump, which is available in backup.
> >>> Will it work?
> >>>
> >>> Thanks
> >>> Leeno
> >>>
> >>>
> >>> On Mon, Jul 15, 2013 at 9:56 PM, Geoff Higginbottom <
> >>> geoff.higginbottom@shapeblue.com> wrote:
> >>>
> >>> The Management Servers are 'Stateless' so as Chip points out, it's the
> >>> DB that stores all the info.
> >>>
> >>> How you actually go about it depends on your current setup.
> >>>
> >>> 1. How many management servers do you currently have?
> >>> 2. Are the original Management Server(s) still functioning, or are
> >>> they down?
> >>> 3. Is DB on a separate server, or the same as the Management Server?
> >>>
> >>> Regards
> >>>
> >>> Geoff Higginbottom
> >>>
> >>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >>>
> >>> geoff.higginbottom@shapeblue.com
> >>>
> >>> -----Original Message-----
> >>> From: Chip Childers [mailto:chip.childers@sungard.com]
> >>> Sent: 15 July 2013 15:50
> >>> To: users@cloudstack.apache.org
> >>> Subject: Re: Rebuilding management server
> >>>
> >>> On Mon, Jul 15, 2013 at 03:19:42PM +0530, Leeno Jose.P.A wrote:
> >>>
> >>> Hi Users,
> >>>
> >>> Has anyone tried to rebuild a management server with XenServer hosts?
> >>> If so, could you please share your experience?
> >>>
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>>
> >>> I have not, but one of the most critical aspects of this is to ensure
> >>> that your database is retained.
> >>>
> >>> This email and any attachments to it may be confidential and are
> >>> intended solely for the use of the individual to whom it is addressed.
> >>> Any views or opinions expressed are solely those of the author and do
> >>> not necessarily represent those of Shape Blue Ltd or related
> >>> companies. If you are not the intended recipient of this email, you
> >>> must neither take any action based upon its contents, nor copy or show
> >>> it to anyone. Please contact the sender if you believe you have
> >>> received this email in error. Shape Blue Ltd is a company incorporated
> >>> in England & Wales. ShapeBlue Services India LLP is operated under
> >>> license from Shape Blue Ltd. ShapeBlue is a registered trademark.
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Leeno Jose .P.A
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Todd Pigram
> >>> todd@toddpigram.com
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Leeno Jose .P.A
> >>
> >
> >
> >
> > --
> > Leeno Jose .P.A
> >
>



-- 
Leeno Jose .P.A
