Date: Tue, 3 Feb 2015 15:23:51 -0500
Subject: Re: Virtual Routers not starting up after host restart
From: Mohammad Rastgoo
To: users@cloudstack.apache.org

top in cli comes back with qemu process but nothing starts.
Here is the agent pastebin log http://pastebin.com/cf5KY52Q

On Mon, Feb 2, 2015 at 4:23 PM, Andrei Mikhailovsky wrote:

> Mohammad, any errors on the host side? Can you check if VRs are being
> created on the host? Also, check if you can get the console (from the
> hypervisor and not from the ACS GUI). Perhaps there is a clue on what's
> happening.
>
> By the way, are you other system vms working okay? Like ssvm and cpvm?
>
> Andrei
>
> ----- Original Message -----
>
> > From: "Mohammad Rastgoo"
> > To: users@cloudstack.apache.org
> > Sent: Monday, 2 February, 2015 7:42:13 PM
> > Subject: Re: Virtual Routers not starting up after host restart
> >
> > UP and green.
> >
> > On Mon, Feb 2, 2015 at 2:34 PM, Andrei Mikhailovsky wrote:
> >
> > > From what I can see, the ACS is unable to contact your hypervisor
> > > host server:
> > >
> > > 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
> > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > Invocation exception, caused by:
> > > com.cloud.exception.AgentUnavailableException:
> > > Resource [Host:1] is unreachable: Host 1: Unable to start instance
> > > due to Unable to start VM[DomainRouter|r-29-VM] due to error in
> > > finalizeStart, not retrying
> > >
> > > What is the status of your host server? Is it shown as
> > > Up/Alert/Disconnected/Connecting?
> > >
> > > Andrei
> > >
> > > ----- Original Message -----
> > >
> > > > From: "Mohammad Rastgoo"
> > > > To: users@cloudstack.apache.org
> > > > Sent: Monday, 2 February, 2015 7:27:30 PM
> > > > Subject: Re: Virtual Routers not starting up after host restart
> > > >
> > > > Andrei,
> > > >
> > > > Below is the partial MS log. I have marked couple parts in bold.
> > > > Might be dumb but my first thought was maybe iptables is causing
> > > > it, yet I have no good explanations for it.
> > > >
> > > > 2015-02-02 13:17:14,152 WARN [o.a.c.alerts]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > alertType:: 9 // dataCenterId:: 1 // podId:: 1 // clusterId:: null //
> > > > message:: Command: com.cloud.agent.api.check.CheckSshCommand failed
> > > > while starting virtual router
> > > > 2015-02-02 13:17:14,233 WARN
> > > > [c.c.n.r.VirtualNetworkApplianceManagerImpl]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > Command: com.cloud.agent.api.check.CheckSshCommand failed while
> > > > starting virtual router
> > > > 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > > (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134
> > > > seconds
> > > > 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > > (Timer-1:ctx-3041bbe4) Task (job-245) has been pending for 1133
> > > > seconds
> > > > 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > > (Timer-1:ctx-526a6af6) Task (job-244) has been pending for 1194
> > > > seconds
> > > > 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> > > > (Timer-1:ctx-526a6af6) Task (job-245) has been pending for 1193
> > > > seconds
> > > > 2015-02-02 13:19:16,969 ERROR [c.c.v.VirtualMachineManagerImpl]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > Failed to start instance VM[DomainRouter|r-29-VM]
> > > > com.cloud.utils.exception.ExecutionException: Unable to start
> > > > VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> > > > 2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > VM state transitted from :Starting to Stopped with event:
> > > > OperationFailedvm's original host id: null new host id: null host id
> > > > before state transition: 1
> > > > 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > Invocation exception, caused by:
> > > > com.cloud.exception.AgentUnavailableException:
> > > > Resource [Host:1] is unreachable: Host 1: Unable to start instance
> > > > due to Unable to start VM[DomainRouter|r-29-VM] due to error in
> > > > finalizeStart, not retrying
> > > > 2015-02-02 13:19:17,585 INFO [c.c.v.VmWorkJobHandlerProxy]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> > > > Rethrow exception com.cloud.exception.AgentUnavailableException:
> > > > Resource [Host:1] is unreachable: Host 1: Unable to start instance
> > > > due to Unable to start VM[DomainRouter|r-29-VM] due to error in
> > > > finalizeStart, not retrying
> > > > 2015-02-02 13:19:17,586 ERROR [c.c.v.VmWorkJobDispatcher]
> > > > (Work-Job-Executor-24:ctx-114e980e job-244/job-245) Unable to
> > > > complete AsyncJobVO {id:245, userId: 2, accountId: 2, instanceType:
> > > > null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
> > > > rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAHXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
> > > > cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> > > > result: null, initMsid: 161333667508, completeMsid: null,
> > > > lastUpdated: null, lastPolled: null, created: Mon Feb 02 12:58:55
> > > > EST 2015}, job origin:244
> > > >
> > > > *com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
> > > > unreachable: Host 1: Unable to start instance due to Unable to start
> > > > VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> > > > Caused by: com.cloud.utils.exception.ExecutionException: Unable to
> > > > start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
> > > > retrying*
> > > >
> > > > 2015-02-02 13:19:17,930 WARN [o.a.c.e.o.NetworkOrchestrator]
> > > > (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Failed to
> > > > implement network Ntwk[f3a318a2-d6f0-4fcb-be94-4e4586cc20a3|Guest|7]
> > > > elements and resources as a part of network restart due to
> > > > java.lang.RuntimeException: *Job failed due to exception Resource
> > > > [Host:1] is unreachable: Host 1: Unable to start instance due to
> > > > Unable to start VM[DomainRouter|r-29-VM] due to error in
> > > > finalizeStart, not retrying*
> > > > 2015-02-02 13:19:17,930 WARN [c.c.n.NetworkServiceImpl]
> > > > (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Network
> > > > id=207 failed to restart.
> > > > 2015-02-02 13:19:18,135 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> > > > (API-Job-Executor-16:ctx-0dfa85ec job-244) Complete async job-244,
> > > > jobStatus: FAILED, resultCode: 530, result:
> > > > org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"}
> > > > 2015-02-02 13:23:18,345 WARN [c.c.a.d.ParamGenericValidationWorker]
> > > > (catalina-exec-19:ctx-b153f3e0 ctx-d5075366) Received unknown
> > > > parameters for command listNetworks. Unknown parameters : details
> > > >
> > > > On Mon, Feb 2, 2015 at 2:19 PM, Andrei Mikhailovsky wrote:
> > > >
> > > > > Mohammad, what does the management server log say when you try
> > > > > to start VRs? It should have the clue why it is not starting
> > > > >
> > > > > Andrei
> > > > >
> > > > > ----- Original Message -----
> > > > >
> > > > > > From: "Mohammad Rastgoo"
> > > > > > To: users@cloudstack.apache.org
> > > > > > Sent: Monday, 2 February, 2015 6:06:41 PM
> > > > > > Subject: Virtual Routers not starting up after host restart
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Thanks for reading this.
> > > > > >
> > > > > > I have this setup:
> > > > > > server 1: MS + DB
> > > > > > server 2: secondary storage NFS
> > > > > > server 3: kvm - local primary
> > > > > > (all centos 6.6)
> > > > > > net1: isolated network 10.0.0.0/x
> > > > > > net2: shared network (public ip)
> > > > > >
> > > > > > Here are the steps I took:
> > > > > >
> > > > > > 1- stopped all VMs
> > > > > > 2- stopped system VMs (not VRs)
> > > > > > 3- yum updated glibc + reboot on all servers
> > > > > >
> > > > > > Now here is the situation, net2 has remained in setup state
> > > > > > and net1 on allocated.
> > > > > >
> > > > > > sys VMs are back on. VRs are at starting and then stopped.
> > > > > >
> > > > > > so far, I have deleted VRs and restarted networks + clean up.
> > > > > > no luck.
> > > > > >
> > > > > > has anyone encountered the same problem? am I missing
> > > > > > anything here?
> > > > > >
> > > > > > Any help is highly appreciated. Tnx
> > > > > >
> > > > > > --
> > > > > > Mohammad Rastgoo
> > > >
> > > > --
> > > > Mohammad Rastgoo
> > > > Founder & CEO
> >
> > --
> > Mohammad Rastgoo
> > Founder & CEO

--
Mohammad Rastgoo
Founder & CEO
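
For anyone sifting a management-server log like the one quoted in this thread, here is a minimal sketch that keeps only the WARN/ERROR entries mentioning a given async job. It assumes the entry layout seen above (timestamp, level, bracketed logger name, message) and treats any line that does not start with a timestamp as a continuation of the previous entry. The function name filter_entries and its parameters are made up for illustration and are not part of CloudStack.

```python
import re

# Start of a log entry, modelled on the lines quoted in this thread, e.g.
# "2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (...) ..."
# (assumption: other deployments may use a different log pattern).
ENTRY_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s*(?P<rest>.*)$"
)


def filter_entries(lines, job_id, levels=("WARN", "ERROR")):
    """Return (timestamp, level, logger, message) tuples that mention job-<job_id>.

    Hypothetical helper: lines is any iterable of log lines, job_id an int.
    """
    entries = []
    current = None
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            current = [match["ts"], match["level"], match["logger"], match["rest"]]
            entries.append(current)
        elif current is not None:
            # Wrapped message or stack-trace line: attach it to the entry above.
            current[3] += " " + line.strip()
    # Word boundaries keep job-24 from matching job-244.
    token = re.compile(r"\bjob-%d\b" % job_id)
    return [tuple(e) for e in entries if e[1] in levels and token.search(e[3])]
```

Called as filter_entries(open(path), 245) with path pointing at your own management-server log, it should return the VmWorkJobHandlerProxy and VmWorkJobDispatcher errors for the failed router start without the surrounding DEBUG noise.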