cloudstack-users mailing list archives

From Andrei Mikhailovsky <and...@arhont.com>
Subject Re: Virtual Routers not starting up after host restart
Date Mon, 02 Feb 2015 21:23:57 GMT
Mohammad, any errors on the host side? Can you check if VRs are being created on the host?
Also, check if you can get the console (from the hypervisor and not from the ACS GUI). Perhaps
there is a clue there as to what's happening.
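
One way to do the host-side check is to list the libvirt domains on the KVM host and look for router instances. A minimal sketch, assuming the host runs libvirt with `virsh` on the PATH and that routers follow the `r-<id>-VM` naming seen in this thread (for example `r-29-VM`); the helper names are illustrative and not part of CloudStack:

```python
import re
import subprocess

# Router instance names in this thread look like "r-29-VM".
ROUTER_NAME = re.compile(r"^r-\d+-VM$")


def router_domains(virsh_output: str) -> list:
    """Return router domain names found in `virsh list --all --name` output."""
    names = (line.strip() for line in virsh_output.splitlines())
    return [name for name in names if ROUTER_NAME.match(name)]


def list_router_domains() -> list:
    """Run virsh on the local KVM host (needs libvirt privileges)."""
    result = subprocess.run(
        ["virsh", "list", "--all", "--name"],
        capture_output=True,
        text=True,
        check=True,
    )
    return router_domains(result.stdout)
```

Calling `list_router_domains()` on the host while a VR start is in progress should show whether a domain such as `r-29-VM` gets defined at all. If it does appear, `virsh console <name>` is one way to reach its console from the hypervisor rather than from the GUI.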

By the way, are your other system VMs working okay? Like the SSVM and CPVM?

Andrei 
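
When going through a management-server log like the one quoted further down, it can help to pull out only the WARN and ERROR entries for a single job. A rough sketch, assuming the entry layout seen in this thread (timestamp, level, bracketed logger, then the message) and that each entry sits on one line, with any mail-client wrapping already undone; `entries_for_job` is a hypothetical helper name:

```python
import re

# Matches e.g. "2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] ..."
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+(?P<msg>.*)$"
)


def entries_for_job(lines, job_id, levels=("WARN", "ERROR")):
    """Yield (timestamp, level, logger, message) for entries mentioning job_id."""
    token = re.compile(r"\bjob-%d\b" % job_id)
    for line in lines:
        m = ENTRY.match(line.strip())
        if m and m.group("level") in levels and token.search(m.group("msg")):
            yield m.group("ts"), m.group("level"), m.group("logger"), m.group("msg")
```

Fed the lines of the management-server log with `job_id=245`, this keeps the entries that mention that job and drops DEBUG and INFO noise, which makes the ordering of the `CheckSshCommand` warning and the later `AgentUnavailableException` easier to follow.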

----- Original Message -----

> From: "Mohammad Rastgoo" <mohammad@synapti.ca>
> To: users@cloudstack.apache.org
> Sent: Monday, 2 February, 2015 7:42:13 PM
> Subject: Re: Virtual Routers not starting up after host restart

> UP and green.

> On Mon, Feb 2, 2015 at 2:34 PM, Andrei Mikhailovsky
> <andrei@arhont.com>
> wrote:

> > From what I can see, ACS is unable to contact your hypervisor host
> > server:
> >
> >
> > 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> >
> >
> > What is the status of your host server? Is it shown as
> > Up/Alert/Disconnected/Connecting?
> >
> > Andrei
> >
> > ----- Original Message -----
> > > From: "Mohammad Rastgoo" <mohammad@synapti.ca>
> > > To: users@cloudstack.apache.org
> > > Sent: Monday, 2 February, 2015 7:27:30 PM
> > > Subject: Re: Virtual Routers not starting up after host restart
> > >
> > > Andrei,
> > >
> > > > Below is the partial MS log. I have marked a couple of parts in bold.
> > > > This might be dumb, but my first thought was that iptables may be
> > > > causing it, yet I have no good explanation for why.
> > >
> > > 2015-02-02 13:17:14,152 WARN [o.a.c.alerts]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > alertType:: 9 // dataCenterId:: 1 // podId:: 1 // clusterId::
> > > null //
> > > message:: Command: com.cloud.agent.api.check.CheckSshCommand
> > > failed while
> > > starting virtual router
> > > 2015-02-02 13:17:14,233 WARN
> > [c.c.n.r.VirtualNetworkApplianceManagerImpl]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > Command:
> > > com.cloud.agent.api.check.CheckSshCommand failed while starting
> > > virtual
> > > router
> > > 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134
> > > seconds
> > > 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > (Timer-1:ctx-3041bbe4) Task (job-245) has been pending for 1133
> > > seconds
> > > 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > (Timer-1:ctx-526a6af6) Task (job-244) has been pending for 1194
> > > seconds
> > > 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > (Timer-1:ctx-526a6af6) Task (job-245) has been pending for 1193
> > > seconds
> > > 2015-02-02 13:19:16,969 ERROR [c.c.v.VirtualMachineManagerImpl]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > Failed
> > to
> > > start instance VM[DomainRouter|r-29-VM]
> > > com.cloud.utils.exception.ExecutionException: Unable to start
> > > VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
> > > retrying
> > > 2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > VM state
> > > transitted from :Starting to Stopped with event:
> > > OperationFailedvm's
> > > original host id: null new host id: null host id before state
> > transition: 1
> > > 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > Invocation
> > > exception, caused by:
> > > com.cloud.exception.AgentUnavailableException:
> > > Resource [Host:1] is unreachable: Host 1: Unable to start
> > > instance due to
> > > Unable to start VM[DomainRouter|r-29-VM] due to error in
> > > finalizeStart,
> > not
> > > retrying
> > > 2015-02-02 13:19:17,585 INFO [c.c.v.VmWorkJobHandlerProxy]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > Rethrow
> > > exception com.cloud.exception.AgentUnavailableException: Resource
> > [Host:1]
> > > is unreachable: Host 1: Unable to start instance due to Unable to
> > > start
> > > VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
> > > retrying
> > > 2015-02-02 13:19:17,586 ERROR [c.c.v.VmWorkJobDispatcher]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245) Unable to
> > > complete
> > > AsyncJobVO {id:245, userId: 2, accountId: 2, instanceType: null,
> > > instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
> > >
> > rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAHXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
> > > > cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 161333667508, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Feb 02 12:58:55 EST 2015}, job origin:244
> > >
> > >
> > > > *com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> > > > Caused by: com.cloud.utils.exception.ExecutionException: Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying*
> > >
> > > > 2015-02-02 13:19:17,930 WARN [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Failed to implement network Ntwk[f3a318a2-d6f0-4fcb-be94-4e4586cc20a3|Guest|7] elements and resources as a part of network restart due to java.lang.RuntimeException: *Job failed due to exception Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying*
> > > > 2015-02-02 13:19:17,930 WARN [c.c.n.NetworkServiceImpl] (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Network id=207 failed to restart.
> > > > 2015-02-02 13:19:18,135 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-16:ctx-0dfa85ec job-244) Complete async job-244, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"}
> > > > 2015-02-02 13:23:18,345 WARN [c.c.a.d.ParamGenericValidationWorker] (catalina-exec-19:ctx-b153f3e0 ctx-d5075366) Received unknown parameters for command listNetworks. Unknown parameters : details
> > >
> > > On Mon, Feb 2, 2015 at 2:19 PM, Andrei Mikhailovsky
> > > <andrei@arhont.com>
> > > wrote:
> > >
> > > > > Mohammad, what does the management server log say when you try
> > > > > to start the VRs? It should have a clue as to why they are not
> > > > > starting.
> > > >
> > > > Andrei
> > > >
> > > > ----- Original Message -----
> > > >
> > > > > From: "Mohammad Rastgoo" <mohammad@synapti.ca>
> > > > > To: users@cloudstack.apache.org
> > > > > Sent: Monday, 2 February, 2015 6:06:41 PM
> > > > > Subject: Virtual Routers not starting up after host restart
> > > >
> > > > > Hi,
> > > >
> > > > > Thanks for reading this.
> > > >
> > > > > I have this setup:
> > > > > server 1: MS + DB
> > > > > server 2: secondary storage NFS
> > > > > server 3: kvm - local primary
> > > > > (all centos 6.6)
> > > > > net1: isolated network 10.0.0.0/x
> > > > > net2: shared network (public ip)
> > > >
> > > > > Here are the steps I took:
> > > >
> > > > > 1- stopped all VMs
> > > > > 2- stopped system VMs (not VRs)
> > > > > 3- yum updated glibc + reboot on all servers
> > > >
> > > > > > Now here is the situation: net2 has remained in the Setup state
> > > > > > and net1 in Allocated.
> > > >
> > > > > > System VMs are back on. VRs go to Starting and then to Stopped.
> > > >
> > > > > so far, I have deleted VRs and restarted networks + clean up.
> > > > > no
> > > > > luck.
> > > >
> > > > > has anyone encountered the same problem? am I missing
> > > > > anything here?
> > > >
> > > > > Any help is highly appreciated. Tnx
> > > >
> > > > > --
> > > > > Mohammad Rastgoo
> > > >
> > >
> > >
> > >
> > > --
> > > Mohammad Rastgoo
> > > Founder & CEO
> > >
> >

> --
> Mohammad Rastgoo
> Founder & CEO
