cloudstack-issues mailing list archives

From "dsclose (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-9024) Restart network fails if redundant router is missing
Date Wed, 04 Nov 2015 13:44:27 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989542#comment-14989542 ]

dsclose commented on CLOUDSTACK-9024:
-------------------------------------

This appears to have been introduced as part of CLOUDSTACK-6433.
Relevant commit: https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=59a9db3
Commit title: Don't return success if only one RvR builds successfully.

The problem is that the network restart still needs to succeed when only one virtual
router is built. That is the procedure the CloudStack documentation gives for dealing
with faulty routers:

"If you are sure that a virtual router is down forever, or no longer functions as expected,
destroy it. ... Recreate the missing router by using the restartNetwork API with cleanup=false
parameter."
http://cloudstack-administration.readthedocs.org/en/latest/troubleshooting.html#recovering-a-lost-virtual-router
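
For anyone reproducing this via the API rather than the UI, here is a rough sketch of
building the signed restartNetwork request with cleanup=false. It assumes the usual
CloudStack request signing scheme (URL-encode the values, sort by parameter name,
lower-case the string, HMAC-SHA1 with the secret key, Base64). The helper names, the
keys and the network ID are placeholders of mine, not anything taken from this issue.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params, secret_key):
    """Return a signed query string for a CloudStack API call (sketch)."""
    # Sort by parameter name and URL-encode each value (spaces become %20).
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    # Sign the lower-cased query string with HMAC-SHA1, then Base64-encode.
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    return f"{query}&signature={quote(signature, safe='')}"


def restart_network_query(network_id, api_key, secret_key, cleanup=False):
    """Build the query string for restartNetwork (hypothetical helper)."""
    params = {
        "command": "restartNetwork",
        "id": network_id,                  # ID of the network to restart
        "cleanup": str(cleanup).lower(),   # "false" keeps the surviving router
        "response": "json",
        "apikey": api_key,
    }
    return sign_request(params, secret_key)


if __name__ == "__main__":
    # Placeholder values; substitute real ones for your deployment.
    print(restart_network_query("NETWORK-UUID", "APIKEY", "SECRET"))
```

The resulting string is appended to the management server's API URL after a "?".
restartNetwork is asynchronous, so the response only carries a job ID, and the failure
described in this issue shows up in the result of that job rather than in the initial
response.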

> Restart network fails if redundant router is missing
> ----------------------------------------------------
>
>                 Key: CLOUDSTACK-9024
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9024
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: API, Network Controller, Virtual Router
>    Affects Versions: 4.5.2
>         Environment: Cloudstack installed on CentOS 6.5
>            Reporter: dsclose
>
> Restart network action fails if a network is missing a redundant virtual router. This
> occurs if triggered via the UI (Networks -> Select Network -> Restart -> Clean-up:
> False -> OK) or via the API.
> Steps to reproduce:
> ------------------------
> 1. Create a redundant router network offering.
> 2. Create a network using the redundant router network offering.
> 3. Destroy a redundant router from the network. Leave one functioning.
> 4. Initiate the restart network action or restartNetwork API call with clean-up set to False.
> Expected Behaviour:
> --------------------------
> CloudStack should boot a new redundant virtual router to replace the missing router.
> The Network Restart action should return successfully.
> Actual Behaviour:
> -----------------------
> CloudStack boots a replacement redundant router, but the API call returns as unsuccessful.
> The Web UI reports that the router fails.
> Timeline:
> -----------
> 2015-11-03 17:12:08,256 Destroying router "r-985-VM".
> 2015-11-03 17:12:24,511 Performing network restart.
> 2015-11-03 17:14:02,851 Failed to restart network
> Management Log Sample
> ---------------------------------
> 2015-11-03 17:12:14,943 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-12:ctx-a671c200 job-163/job-164) Remove job-164 from job monitoring
> 2015-11-03 17:12:15,739 INFO  [o.a.c.s.v.VolumeServiceImpl] (API-Job-Executor-12:ctx-33b24483 job-163 ctx-4d95a357) Volume 985 is not referred anywhere, remove it from volumes table
> 2015-11-03 17:12:15,850 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-12:ctx-33b24483 job-163) Remove job-163 from job monitoring
> 2015-11-03 17:12:18,698 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-13:ctx-c29ad7f0 job-165) Add job-165 into job monitoring
> 2015-11-03 17:12:18,985 INFO  [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Use same MAC as previous RvR, the MAC is 06:9c:86:00:00:0e
> 2015-11-03 17:12:19,829 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-13:ctx-06672650 job-165/job-166) Add job-166 into job monitoring
> 2015-11-03 17:12:20,078 INFO  [c.c.s.StorageManagerImpl] (Work-Job-Executor-13:ctx-06672650 job-165/job-166 ctx-81c163bb) Storage pool null (1) does not supply IOPS capacity, assuming enough capacity
> 2015-11-03 17:12:40,248 INFO  [c.c.v.VirtualMachineManagerImpl] (DirectAgentCronJob-492:ctx-1fb6ecea) There is pending job or HA tasks working on the VM. vm id: 992, postpone power-change report by resetting power-change counters
> 2015-11-03 17:13:40,384 INFO  [c.c.v.VirtualMachineManagerImpl] (DirectAgentCronJob-280:ctx-846ef4f0) There is pending job or HA tasks working on the VM. vm id: 992, postpone power-change report by resetting power-change counters
> 2015-11-03 17:13:49,799 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-10ff7f5f) Begin cleanup expired async-jobs
> 2015-11-03 17:13:49,825 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] (AsyncJobMgr-Heartbeat-1:ctx-10ff7f5f) End cleanup expired async-jobs
> 2015-11-03 17:13:55,688 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-13:ctx-06672650 job-165/job-166) Remove job-166 from job monitoring
> 2015-11-03 17:13:55,730 WARN  [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Failed to implement network Ntwk[208|Guest|15] elements and resources as a part of network restart due to
> com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Can't find all necessary running routers!
>         at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:202)
>         at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1103)
>         at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2546)
>         at com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1891)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>         at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at com.sun.proxy.$Proxy157.restartNetwork(Unknown Source)
>         at org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:95)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-11-03 17:13:55,732 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Network id=208 failed to restart.
> 2015-11-03 17:13:55,806 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-13:ctx-c29ad7f0 job-165) Remove job-165 from job monitoring
> 2015-11-03 17:14:00,988 INFO  [c.c.n.r.VirtualNetworkApplianceManagerImpl] (RedundantRouterStatusMonitor-7:ctx-5a3246e2) Redundant virtual router (name: r-992-VM, id: 992)  just switch from UNKNOWN to BACKUP



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
