Date: Sun, 20 Mar 2016 14:58:33 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: cloudstack-issues@incubator.apache.org
Reply-To: dev@cloudstack.apache.org
Subject: [jira] [Commented] (CLOUDSTACK-9255) Unable to start VM DomainRouter due to error in finalizeStart, not retrying

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203320#comment-15203320 ]

ASF GitHub Bot commented on CLOUDSTACK-9255:
--------------------------------------------

Github user milamberspace commented on the pull request:

    https://github.com/apache/cloudstack/pull/1356#issuecomment-198946677

    FYI This PR fixes the bug CLOUDSTACK-9255
    https://issues.apache.org/jira/browse/CLOUDSTACK-9255

    (Perhaps this PR needs to be backported to the 4.6 branch.)

> Unable to start VM DomainRouter due to error in finalizeStart, not retrying
> ---------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9255
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9255
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Virtual Router
>    Affects Versions: 4.7.0, 4.6.2, 4.8.0, 4.7.1
>         Environment: Ubuntu 14.04.3
> KVM
> NFS (primary/secondary)
>            Reporter: Milamber
>            Assignee: Wilder Rodrigues
>             Fix For: 4.7.2, 4.9.0
>
>         Attachments: anon-rvr-2nd-after-20.log
>
>
> I've spent 3 days on the same issue: unable to restart a network with clean up (virtual router or redundant virtual router) if the network has at least 20 virtual machines.
> I've tested with CS 4.6.2, 4.7.0, 4.7.1RC1, 4.8.0RC1, same problem. I've used the system VM from apt-get.eu and the latest builds from Jenkins.
> My tests were run with hosts/mgr on Ubuntu 14.04.3 / KVM / NFS primary/secondary.
> My test case (with Ansible modules):
> 1/ create a new network (normal or RVR)
> 2/ create 20 VMs (same params, only the name changes)
> wait for creation to finish
> 3/ restart the network with the clean up option
> 4/ wait for the restart; after some minutes, an error message appears: "Failed to restart network"
> The trace in management.log is:
> 2016-01-23 23:02:51,503 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-51:ctx-9ed51622 job-268/job-271) (logid:b9a521fa) Unable to complete AsyncJobVO {id:271, userId: 2, accountId: 2, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo: rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAMnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 146456419427, completeMsid: null, lastUpdated: null, lastPolled: null, created: Sat Jan 23 22:56:00 CET 2016}, job origin:268
> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-50-VM] due to error in finalizeStart, not retrying
>     at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1119)
>     at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4578)
>     at sun.reflect.GeneratedMethodAccessor374.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>     at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4734)
>     at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
>     at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>     at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>     at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: com.cloud.utils.exception.ExecutionException: Unable to start VM[DomainRouter|r-50-VM] due to error in finalizeStart, not retrying
>     at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1083)
>     at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4578)
>     at sun.reflect.GeneratedMethodAccessor374.invoke(Unknown Source)
>     ... 17 more
> During the network restart I can connect to the VR over SSH via its link-local address; the last lines show:
> 2016-01-23 22:02:39,780 configure.py __init__:128 AclIP created for rule ==> {'last_port': 65535, u'protocol': u'tcp', u'revoked': False, u'already_added': True, u'source_cidr_list': [u'0.0.0.0/0'], 'cidr': [u'0.0.0.0/0'], u'id': 52, u'src_ip': u'192.168.13.30', u'purpose': u'Firewall', 'allowed': True, 'action': 'ACCEPT', u'src_port_range': [1, 65535], u'traffic_type': u'Ingress', 'type': u'tcp', u'default_egress_policy': False, 'first_port': 1}
> 2016-01-23 22:02:39,780 configure.py add_rule:165 Current ACL IP direction is ==> ingress
> 2016-01-23 22:02:39,780 merge.py load:60 Loading data bag type forwardingrules
> Broadcast message from root@r-50-VM (Sat Jan 23 22:02:45 2016):
> The system is going down for system halt NOW!
> Broadcast message from root@r-50-VM (Sat Jan 23 22:02:45 2016):
> Power button pressed
> The system is going down for system halt NOW!
> /opt/cloud/bin/vr_cfg.sh: line 60: 16845 Killed /opt/cloud/bin/update_config.py vm_metadata.json
> Sat Jan 23 22:02:46 UTC 2016 : VR config: executing failed: /opt/cloud/bin/update_config.py vm_metadata.json
> Connection to 169.254.2.186 closed by remote host.
> Connection to 169.254.2.186 closed.
> Perhaps this is a timeout issue? If I create one VM or 10 VMs, the network restart works.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)