Subject: Re: CS 4.9 NIO Selector wait time PR-1601
From: martin kolly
To: dev@cloudstack.apache.org
Date: Mon, 29 Aug 2016 16:40:32 +0200
Message-ID: <57C44960.3040702@senselan.ch>

We have done more investigations.
An error happens when the script /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh is executed:

MGMT LOGS:
2016-08-29 16:00:11,342 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1837187172990452475: Processing: { Ans: , MgmtId: 90520741415395, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetRouterAlertsAnswer":{"result":false,"details":"/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found","wait":0}}] }
2016-08-29 16:00:11,345 DEBUG [c.c.a.t.Request] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Seq 8-1837187172990452475: Received: { Ans: , MgmtId: 90520741415395, via: 8(kvm702), Ver: v1, Flags: 10, { GetRouterAlertsAnswer } }
*2016-08-29 16:00:11,345 DEBUG [c.c.a.m.AgentManagerImpl] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Details from executing class com.cloud.agent.api.routing.GetRouterAlertsCommand: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found*
2016-08-29 16:00:11,345 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Unable to get alerts from router r-221-VM /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found
2016-08-29 16:00:11,411 INFO [o.a.c.e.o.NetworkOrchestrator] (Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104) NetworkGarbageCollector uses '600' seconds for GC interval.
2016-08-29 16:00:11,416 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104) We found network 219 to be free for the first time. Adding it to the list: 1472479211
2016-08-29 16:00:12,392 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-c1aeb528) (logid:a364cc05) StorageCollector is running...
2016-08-29 16:00:12,403 DEBUG [c.c.h.o.r.Ovm3HypervisorGuru] (StatsCollector-6:ctx-c1aeb528) (logid:a364cc05) getCommandHostDelegation: class com.cloud.agent.api.GetStorageStatsCommand

For debugging we modified "router_proxy.sh" slightly on the KVM host, adding the echo on line 47:

kvm# grep -A 2 DEBUG /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh -n
46:# DEBUG CS Upgrade
47-echo "$(date): ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert root@$domRIp /opt/cloud/bin/$script $*" >> /tmp/router.txt
48-ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert root@$domRIp "/opt/cloud/bin/$script $*"

kvm# tail /tmp/router.txt
Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67 /opt/cloud/bin/netusage.sh -g
Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67 /opt/cloud/bin/netusage.sh -g

On the router with IP 169.254.2.67 we see the failed logins as well:

root@r-221-VM:~# tail -n 2 /var/log/auth.log
Aug 29 14:10:11 r-221-VM sshd[7074]: Connection closed by 169.254.0.1 [preauth]
Aug 29 14:10:11 r-221-VM sshd[7075]: Connection closed by 169.254.0.1 [preauth]

These commands did not succeed. If we execute the command manually, we get prompted for the key passphrase:

kvm# ssh -p 3922 -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67
Enter passphrase for key '/root/.ssh/id_rsa.cloud':

Can someone confirm that the RSA key is protected with a passphrase? This looks suspicious to us...

regards
martin

On 08/29/2016 10:09 AM, martin kolly wrote:
> thanks Simon and Rohit for the valuable inputs! After applying the
> following procedure the ssl errors are gone and host state is UP.
>
> # service cloudstack-management stop
>
> mysql> delete from configuration where name = "ssl.keystore" ;
> # mv /etc/cloudstack/management/cloudmanagementserver.keystore /etc/cloudstack/management/cloudmanagementserver.keystore.old
>
> # service cloudstack-management start
>
> # file /etc/cloudstack/management/cloudmanagementserver.keystore
> /etc/cloudstack/management/cloudmanagementserver.keystore: Java KeyStore
> # file /etc/cloudstack/management/cloudmanagementserver.keystore.old
> /etc/cloudstack/management/cloudmanagementserver.keystore.old: data
>
> #java -version
> java version "1.7.0_111"
>
> However new routers are not deployed, we still see entries that KVM
> hosts are unreachable (logs underneath).
>
> After starting a new router it is shown with "virsh list" on the KVM:
> kvm# virsh list | grep r-233-VM
> 111 r-233-VM running
>
> After some seconds the router is deleted,
>
> kvm# virsh list | grep r-233-VM
> #
>
> kvm#virsh -v
> 1.2.2
>
> (libvirt was already restarted)
>
> ############ Logs/Info KVM #################
>
> kvm # dpkg -l | grep cloudstack
> ii cloudstack-agent 4.9.0 all CloudStack agent
> ii cloudstack-common 4.9.0 all A common package which contains files which are shared by several CloudStack packages
>
> kvm# df -kh | grep cloud
> 10.100.12.9:/export/cloud 188G 137G 51G 73% /mnt/5db02c19-1e8f-3591-bdb4-02608362521e
>
> kvm# tail -f /var/log/cloudstack/agent/agent.log
> ...
> 2016-08-29 06:55:06.203+0000: 4425: error : qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine balloon device path
> 2016-08-29 06:55:06.223+0000: 4426: error : qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine balloon device path
> 2016-08-29 06:55:24.062+0000: 4425: warning : qemuDomainObjTaint:1628 : Domain id=112 name='r-233-VM' uuid=d9dcd37a-242d-43ac-a18e-79a4bfa86ebb is tainted: high-privileges
> 2016-08-29 06:55:49.066+0000: 4423: error : qemuMonitorIO:656 : internal error: End of file from monitor
>
> ############ Logs/Info Management Server ########
>
> # dpkg -l | grep cloudstack
> ii cloudstack-agent 4.9.0 all CloudStack agent
> ii cloudstack-common 4.9.0 all A common package which contains files which are shared by several CloudStack packages
> ii cloudstack-management 4.9.0 all CloudStack server library
> ii cloudstack-usage 4.9.0 all CloudStack usage monitor
>
> # mysql -u root cloud -e "select id,name,path from cloud.storage_pool where pool_type='Filesystem'"
> +----+----------------------+-------------------------+
> | id | name                 | path                    |
> +----+----------------------+-------------------------+
> | 1  | kvm704 Local Storage | /var/lib/libvirt/images |
> | 2  | kvm701 Local Storage | /var/lib/libvirt/images |
> | 3  | kvm702 Local Storage | /var/lib/libvirt/images |
> | 4  | kvm703 Local Storage | /var/lib/libvirt/images |
> +----+----------------------+-------------------------+
>
> #tail -f /var/log/cloudstack/management/management-server.log | grep -v DEBUG
> ...
> 2016-08-29 08:55:42,484 WARN [o.a.c.alerts] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) alertType:: 9 // dataCenterId:: 1 // podId:: 1 // > clusterId:: null // message:: Command: > com.cloud.agent.api.GetDomRVersionCommand failed while starting virtual > router > 2016-08-29 08:55:42,494 ERROR > [c.c.n.r.VirtualNetworkApplianceManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) GetDomRVersionCmd failed > 2016-08-29 08:55:42,495 WARN > [c.c.n.r.VirtualNetworkApplianceManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Command: com.cloud.agent.api.GetDomRVersionCommand > failed while starting virtual router > 2016-08-29 08:55:42,495 INFO [c.c.v.VirtualMachineManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) The guru did not like the answers so stopping > VM[DomainRouter|r-233-VM] > 2016-08-29 08:55:49,184 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) Begin cleanup > expired async-jobs > 2016-08-29 08:55:49,192 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) End cleanup > expired async-jobs > 2016-08-29 08:55:50,861 ERROR [c.c.v.VirtualMachineManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Failed to start instance VM[DomainRouter|r-233-VM] > com.cloud.utils.exception.ExecutionException: Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-08-29 08:55:51,136 ERROR [c.c.v.VmWorkJobHandlerProxy] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Invocation exception, caused by: > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > 2016-08-29 08:55:51,136 INFO 
[c.c.v.VmWorkJobHandlerProxy] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Rethrow exception > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > 2016-08-29 08:55:51,137 ERROR [c.c.v.VmWorkJobDispatcher] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e) > Unable to complete AsyncJobVO {id:2739, userId: 2, accountId: 2, > instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, > cmdInfo: > rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAA6XQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, > cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, > result: null, initMsid: 90520741415395, completeMsid: null, lastUpdated: > null, lastPolled: null, created: Mon Aug 29 08:55:21 CEST 2016}, job > origin:2738 > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: com.cloud.utils.exception.ExecutionException: Unable to > start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in > finalizeStart, not retrying > at > 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > ... 17 more > 2016-08-29 08:55:51,168 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e) > Remove job-2739 from job monitoring > 2016-08-29 08:55:53,197 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:db064d99) > Add job-2740 into job monitoring > 2016-08-29 08:55:53,218 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:337b354e) > Remove job-2740 from job monitoring > 2016-08-29 08:55:53,247 ERROR [c.c.a.ApiAsyncJobDispatcher] > (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Unexpected > exception while executing > org.apache.cloudstack.api.command.admin.router.StartRouterCmd > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: com.cloud.utils.exception.ExecutionException: Unable to > start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in > finalizeStart, not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > ... 17 more > 2016-08-29 08:55:53,256 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Remove > job-2738 from job monitoring > > > > > regards > Martin > > > > > On 08/26/2016 08:47 PM, Rohit Yadav wrote: >> Hi Martin, >> >> >> I checked my openssl connect output and my management server on port 8250 was indeed returning valid certificates. 
On further investigation, when connections to management server (NioServer, handled by Link class that reads /etc/cloudstack/management/cloudmanagementserver.keystore) are received it uses a keystore file (/etc/cloudstack/management/cloudmanagementserver.keystore). Please check if that's valid, or backup your keystore file and do what Simon has suggested. >> >> >> >> For reference, here the output from my 4.9.0 mgmt server running on Ubuntu 14.04.4: >> >> >> $ openssl s_client -tls1 -connect 192.168.1.11:8250 >> CONNECTED(00000003) >> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User >> verify error:num=18:self signed certificate >> verify return:1 >> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User >> verify return:1 >> --- >> Certificate chain >> 0 s:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> i:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> --- >> Server certificate >> -----BEGIN CERTIFICATE----- >> MIIDPzCCAiegAwIBAgIEFHHX2DANBgkqhkiG9w0BAQsFADBQMRAwDgYDVQQGEwdV >> bmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYDVQQLEwdibHVlYm94MRgwFgYD >> VQQDEw9DbG91ZHN0YWNrIFVzZXIwHhcNMTUwNTA2MTM0MjMzWhcNMjUwNTAzMTM0 >> MjMzWjBQMRAwDgYDVQQGEwdVbmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYD >> VQQLEwdibHVlYm94MRgwFgYDVQQDEw9DbG91ZHN0YWNrIFVzZXIwggEiMA0GCSqG >> SIb3DQEBAQUAA4IBDwAwggEKAoIBAQC0wN8kfOJMzwlbrOnBj/jjvjjIwDVpYLtH >> WoKkNB+rzzKEUxaYwaQxe6E3M536ZuqcaJBIqcYPwTIkWyulvuHuJpSQak4VbuDV >> f7dqt5RacLFT0jUciqTvL5QDCrk0uNugKkWgEvNtokVGSBwLPVEcdcGWpku1EpeH >> vMYmpOkcWgbC8Z9D7QTlVw6oEWbPAtKr+gDrXdOFnpPPI45rteatIIgKm1Q6JjZM >> qrUKfqt7s8ts6ZgdAN2WmtieSsnUX1su9SJMYg2J8LK7UJeGqiNtE+g944GPqtnW >> aDldjharq54e79ug2ktxw29I3ulpRD/vgxwZmcPJrePUKwY91KEjAgMBAAGjITAf >> MB0GA1UdDgQWBBSUJvLY8RL/1fTVqj1rT8136da4yzANBgkqhkiG9w0BAQsFAAOC >> AQEAgDdFIgLvOH/UgRp2nnFUVcMp+uchSLj8CbCkukJBrUwrmJHp3Os+H1ggk8Vt >> j3conj06zJBNN/E0J8pcpagE1aR+l4R8WxF3g/Oc7bNyrUlkGSQ82vavg9sEkwHY >> eQY/4wj8CprICs9JilgZ6keeWNWgAW1goLZSzGVwz5eE0lPuc2Dg3laR5RsuTxie >> 
dgQhpbOx3UZun+dhuP5NUHc+KWyrNvSZNN8FruO602KWZwm0Hndl7RVbkNEd0kxq >> FhFK4Scc2HBrKMUrPTzO1nGCgR1gA015C2MFfmjeW49VTi95WnY8DDG2euUYAtpl >> lLVuNxJxq7eDJfP/M9kxSIKgrQ== >> -----END CERTIFICATE----- >> subject=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> issuer=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> --- >> No client certificate CA names sent >> Server Temp Key: ECDH, P-256, 256 bits >> --- >> SSL handshake has read 1325 bytes and written 331 bytes >> --- >> New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-SHA >> Server public key is 2048 bit >> Secure Renegotiation IS supported >> Compression: NONE >> Expansion: NONE >> No ALPN negotiated >> SSL-Session: >> Protocol : TLSv1 >> Cipher : ECDHE-RSA-AES256-SHA >> Session-ID: 57C08E5D841D9B74AB533A0000BDBE4275424640B5B7D1985D438B275598436F >> Session-ID-ctx: >> Master-Key: E1DE7D7464766DCECD82088561A7D928519CA20BA91FDF4AA38BB2E51AB7C27D3F1ABA5788ADED2AD49FE8597EC4A344 >> Key-Arg : None >> PSK identity: None >> PSK identity hint: None >> SRP username: None >> Start Time: 1472237149 >> Timeout : 7200 (sec) >> Verify return code: 18 (self signed certificate) >> --- >> >> >> >> >> Regards. >> >> ________________________________ >> From: martin kolly >> Sent: 26 August 2016 20:36:02 >> To: dev@cloudstack.apache.org >> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >> >> >> good point, thanks Simon! with openssl we receive a response on port 8250. >> >> # telnet 10.100.12.10 8250 >> Trying 10.100.12.10... >> Connected to 10.100.12.10. >> Escape character is '^]'. >> Connection closed by foreign host. >> >> # nc -zv 10.100.12.10 8250 >> Connection to 10.100.12.10 8250 port [tcp/*] succeeded! 
>> >> # openssl s_client -tls1 -connect 10.100.12.10:8250 >> CONNECTED(00000003) >> write:errno=104 >> --- >> no peer certificate available >> --- >> No client certificate CA names sent >> --- >> SSL handshake has read 0 bytes and written 0 bytes >> --- >> New, (NONE), Cipher is (NONE) >> Secure Renegotiation IS NOT supported >> Compression: NONE >> Expansion: NONE >> SSL-Session: >> Protocol : TLSv1 >> Cipher : 0000 >> Session-ID: >> Session-ID-ctx: >> Master-Key: >> Key-Arg : None >> PSK identity: None >> PSK identity hint: None >> SRP username: None >> Start Time: 1472223447 >> Timeout : 7200 (sec) >> Verify return code: 0 (ok) >> --- >> >> >> On 08/26/2016 04:49 PM, Simon Weller wrote: >>> Martin, >>> >>> >>> Are you able to actually telnet to 8250 from the host to the mgmt server? >>> >>> >>> - Si >>> >>> >>> ________________________________ >>> From: martin kolly >>> Sent: Friday, August 26, 2016 9:41 AM >>> To: dev@cloudstack.apache.org >>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> Hi Rohit >>> >>> We highly appreciate your efforts! Unfortunately it still does not work. 
>>> - ulimit is increased on mgmt server >>> - jar file replaced >>> - we confirm that cloudstack-agent 4.9.0 is installed >>> >>> MGMT Server >>> # wget https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar -O cloud-utils-4.9.0.jar.patch >>> # md5sum cloud-utils-4.9.0.jar.patch >>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch >>> # md5sum /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar >>> c4496f42cc6741f562ac645c3a3d8a0c /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar >>> >>> # ulimit -a >>> core file size (blocks, -c) 0 >>> data seg size (kbytes, -d) unlimited >>> scheduling priority (-e) 0 >>> file size (blocks, -f) unlimited >>> pending signals (-i) 64109 >>> max locked memory (kbytes, -l) 64 >>> max memory size (kbytes, -m) unlimited >>> open files (-n) 10240 >>> pipe size (512 bytes, -p) 8 >>> POSIX message queues (bytes, -q) 819200 >>> real-time priority (-r) 0 >>> stack size (kbytes, -s) 8192 >>> cpu time (seconds, -t) unlimited >>> max user processes (-u) 64109 >>> virtual memory (kbytes, -v) unlimited >>> file locks (-x) unlimited >>> >>> KVM Server >>> # wget https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar -O cloud-utils-4.9.0.jar.patch >>> # md5sum cloud-utils-4.9.0.jar.patch >>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch >>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> c4496f42cc6741f562ac645c3a3d8a0c /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> >>> # apt-cache policy cloudstack-agent >>> cloudstack-agent: >>> Installed: 4.9.0 >>> Candidate: 4.9.0 >>> Version table: >>> *** 4.9.0 0 >>> 500 http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages >>> 100 /var/lib/dpkg/status >>> >>> The logs are attached. By the way: the error message "Caught the Exception in VmIpFetchTask" was already there with 4.8 release. 
>>>
>>> Thanks
>>> Martin
>>>
>>> On 08/25/2016 07:44 PM, Rohit Yadav wrote:
>>>
>>> Hi Martin,
>>>
>>> Thanks for sharing. Alright, I'm not sure what's causing issue but based on the logs seems like only KVM agents are having issues while connecting to mgmt server as I don't see any Nio related exceptions in the management server logs.
>>>
>>> I could not see the cloudstack-agent version in the logs, I'm assuming that they were all upgraded to 4.9.0, and there are no conflicting jars at /usr/share/cloudstack-agent/lib.
>>>
>>> First, can you make sure mgmt server has enough ulimit. I found that Ubuntu/Debian's init.d script don't override this while CentOS initd/systemd script sets ulimit. On your mgmt server, edit /etc/init.d/cloudstack-management and add ulimit -n 10240 just before the mgmt server is started in the 'state' section (for me it was at around line #147 where it logs a message that it's starting the cloudstack-management server).
>>>
>>> Next, if this still does not solve the issue -- I created a special cloud-utils.jar for you that you need to place on your mgmt server and on the KVM agents and restart the mgmt server. This will increase verbosity of the error while reduce the Nio polling loop timeout (from 100ms to 10ms). On KVM agents, the error from the logs is that during SSL handshake inbound connection/stream gets closed, and we want to know the exception message. Please get the jar from here:
>>>
>>> https://github.com/rhtyd/cloudstack/releases/tag/4.9.0-nioinbound and place them at:
>>>
>>> /usr/share/cloudstack-agent/lib/ (on kvm host)
>>>
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/ (on mgmt server host)
>>>
>>> Let me know what worked for you, and if it still failed can you share the mgmt server and agent logs once again. Thanks.
>>>
>>> Regards.
>>>
>>> ________________________________
>>> From: martin kolly
>>> Sent: 25 August 2016 20:50:08
>>> To: dev@cloudstack.apache.org
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Hi Rohit
>>>
>>> We are running java version 1.7.0.111 on KVM and management server.
>>> mgmt# java -version
>>> java version "1.7.0_111"
>>> kvm# java -version
>>> java version "1.7.0_111"
>>>
>>> We get the same error message. Attached are the logs with TRACE enabled.
>>>
>>> "success consists of going from failure to failure without loss of enthusiasm."
>>>
>>> regards
>>> martin
>>>
>>> On 08/25/2016 02:02 PM, Rohit Yadav wrote:
>>>
>>> Hi Martin,
>>>
>>> Thanks for sharing, on the surface there does not seem to be any issue in configuration causing the failures. I'm personally running KVM and Ubuntu hosts based env without issues, I'm on Ubuntu 14.04.4 (Linux bluebox 3.16.0-45-generic #60~14.04.1-Ubuntu) and java 1.7.0_79. Can you try upgrading your JRE7 to latest (openjdk-7-jre, 7u111-2.6.7-0ubuntu0.14.04.3) on all mgmt server and kvm hosts?
>>>
>>> If upgrading your JRE does not help, can you increase the logging verbosity for both the agent and management server (in /etc/cloudstack/{agent, management} there would be a log4j file, edit that and replace DEBUG/INFO with TRACE for class/keys com.cloud and org.apache.cloudstack) and re-share logs when the failures occur? I want to see what additional information we can get from logs when it tries to connect to host 10.100.12.10 on port: 8250.
>>>
>>> Regards.
>>>
>>> ________________________________
>>> From: martin kolly
>>> Sent: 25 August 2016 17:11:06
>>> To: dev@cloudstack.apache.org
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> @Simon: We have one management server with local DB. KVMs connect
>>> directly to the management server without any security/loadbalancing
>>> device.
>>>
>>> Thanks
>>> Martin
>>>
>>> On 08/25/2016 12:41 PM, Simon Weller wrote:
>>>
>>> Martin,
>>>
>>> Can you provide more detail about your haproxy setup?
>>> Are you running it on separate servers, or on the management server itself?
>>>
>>> - Si
>>>
>>> Simon Weller/ENA
>>> (615) 312-6068
>>>
>>> rohit.yadav@shapeblue.com
>>> www.shapeblue.com
>>> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
>>> @shapeblue
>>>
>>> -----Original Message-----
>>> From: martin kolly [martin.kolly@senselan.ch]
>>> Received: Thursday, 25 Aug 2016, 5:04AM
>>> To: Rohit Yadav [rohit.yadav@shapeblue.com]; dev@cloudstack.apache.org [dev@cloudstack.apache.org]
>>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Thanks for your reply.
>>>
>>> This morning we repeated the upgrade process from 4.8 to 4.9 with the following repository:
>>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/.
>>>
>>> Unfortunately we ran into the same issue:
>>>
>>> 2016-08-25 09:49:00,660 INFO [utils.nio.NioClient] (main:null) (logid:) Connecting to 10.100.12.10:8250
>>> 2016-08-25 09:49:00,668 WARN [utils.nio.Link] (main:null) (logid:) This SSL engine was forced to close inbound due to end of stream.
>>> 2016-08-25 09:49:00,668 ERROR [utils.nio.NioClient] (main:null) (logid:) SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-25 09:49:00,668 ERROR [utils.nio.NioConnection] (main:null) (logid:) Unable to initialize the threads.
>>> java.io.IOException: SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>>     at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
>>>     at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
>>>     at com.cloud.agent.Agent.start(Agent.java:237)
>>>     at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
>>>     at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
>>>     at
>>> com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
>>>     at com.cloud.agent.AgentShell.start(AgentShell.java:456)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
>>> 2016-08-25 09:49:00,669 INFO [utils.exception.CSExceptionErrorCode] (main:null) (logid:) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
>>> 2016-08-25 09:49:00,669 WARN [cloud.agent.Agent] (main:null) (logid:) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-25 09:49:00,670 INFO [cloud.agent.Agent] (main:null) (logid:) Attempted to connect to the server, but received an unexpected exception, trying again...
>>>
>>> *KVM Hosts:*
>>> # java -version
>>> java version "1.7.0_95"
>>> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
>>> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
>>>
>>> # dpkg --get-selections | grep -e 'jdk' -e 'java'
>>> ca-certificates-java install
>>> java-common install
>>> libcommons-daemon-java install
>>> openjdk-7-jre-headless:amd64 install
>>> tzdata-java install
>>>
>>> # apt-cache policy cloudstack-agent
>>> cloudstack-agent:
>>>   Installed: 4.9.0
>>>   Candidate: 4.9.0
>>>   Version table:
>>>  *** 4.9.0 0
>>>         500 http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages
>>>         100 /var/lib/dpkg/status
>>>
>>> # find /usr/share/ -name "cloud-utils*.jar"
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> # md5sum
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> a8de7306d7c80b5a73e93b83afdd119f  /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>>
>>> *Management Server:*
>>> # java -version
>>> java version "1.7.0_95"
>>> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
>>> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
>>>
>>> # dpkg --get-selections | grep -e 'jdk' -e 'java'
>>> ca-certificates-java install
>>> java-common install
>>> libcommons-collections3-java install
>>> libcommons-daemon-java install
>>> libcommons-dbcp-java install
>>> libcommons-pool-java install
>>> libecj-java install
>>> libgeronimo-jta-1.1-spec-java install
>>> libmysql-java install
>>> libservlet2.5-java install
>>> libtomcat6-java install
>>> openjdk-7-jre-headless:amd64 install
>>> tzdata-java install
>>>
>>> # apt-cache policy cloudstack-management
>>> cloudstack-management:
>>>   Installed: 4.9.0
>>>   Candidate: 4.9.0
>>>   Version table:
>>>  *** 4.9.0 0
>>>         500 http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages
>>>         100 /var/lib/dpkg/status
>>>
>>> # find /usr/share/ -name "cloud-utils*.jar"
>>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar
>>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar
>>> # md5sum /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar
>>> a8de7306d7c80b5a73e93b83afdd119f  /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar
>>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> a8de7306d7c80b5a73e93b83afdd119f  /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar
>>> # md5sum /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar
>>> a8de7306d7c80b5a73e93b83afdd119f  /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar
>>>
>>> The classpath.conf was not modified:
>>> # cat /etc/cloudstack/management/classpath.conf
>>> #!/bin/bash
>>> #...
>>>
>>> SYSTEMJARS=""
>>> SCP=$(build-classpath $SYSTEMJARS 2>/dev/null) ; if [ $? != 0 ] ; then export SCP="" ; fi
>>> MCP=""
>>> DCP="/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar"
>>> CLASSPATH=$SCP:$DCP:$MCP:/etc/cloudstack/management:/usr/share/cloudstack-management/setup
>>> for jarfile in ""/* ; do
>>>   if [ ! -e "$jarfile" ] ; then continue ; fi
>>>   CLASSPATH=$jarfile:$CLASSPATH
>>> done
>>> for plugin in ""/* ; do
>>>   if [ ! -e "$plugin" ] ; then continue ; fi
>>>   CLASSPATH=$plugin:$CLASSPATH
>>> done
>>> for vendorconf in "/etc/cloudstack/management"/vendor/* ; do
>>>   if [ ! -d "$vendorconf" ] ; then continue ; fi
>>>   CLASSPATH=$vendorconf:$CLASSPATH
>>> done
>>> export CLASSPATH
>>> if ([ -z "$JAVA_HOME" ] || [ ! -d "$JAVA_HOME" ]) && [ -d /usr/lib/jvm/jre-1.7.0 ]; then
>>>   export JAVA_HOME=/usr/lib/jvm/jre-1.7.0
>>> fi
>>> PATH=$JAVA_HOME/bin:/sbin:/usr/sbin:$PATH
>>> export PATH
>>>
>>> Regards
>>> Martin
>>>
>>> On 08/24/2016 06:56 PM, Rohit Yadav wrote:
>>>
>>> Martin,
>>>
>>> Were you able to fix your issue after installing packages from the repo Will shared and restarting the services?
>>>
>>> I've not personally tested the apt-get.eu repo, but I had earlier built this repo which I'm personally using in my local KVM-trusty based cloud: http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/
>>>
>>> If you're still getting the error, can you share the JRE version you're running, both on the mgmt server and on the KVM hosts? You can run java -version, or share output of "dpkg --get-selections | grep -e 'jdk' -e 'java'". Are you running CloudStack with any additional plugins?
>>>
>>> From the logs, it looks like there are mixed jar files: the NioConnectionException class was not found, so something is wrong with your installation. There must be a cloud-utils jar file; make sure your installation doesn't have multiple copies/versions of jars somewhere in the /usr/share/cloudstack-common and /usr/share/cloudstack-management/webapps/client/ paths:
>>>
>>> Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
>>>
>>> The error "Unable to initialize the threads." suggests the JVM was not able to spawn threads. I would like to know your JRE version and any other settings configured in /etc/cloudstack/management/classpath.conf (and there are a bunch of other files where JAVA_OPTS might have been overridden). Note: For now you should only be using JRE 1.7.
>>>
>>> Regards.
>>>
>>> ------------------------------------------------------------------------
>>> *From:* martin kolly
>>> *Sent:* 24 August 2016 19:53:26
>>> *To:* dev@cloudstack.apache.org; Rohit Yadav
>>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> Thanks Will!
>>>
>>> Yes, the repo is pointing to the 4.9 release for all KVMs and for the management server:
>>> cloudstack:~# cat /etc/apt/sources.list.d/cloudstack.list
>>> deb http://cloudstack.apt-get.eu/ubuntu trusty 4.9
>>>
>>> All KVM agents and the mgmt server are upgraded to release 4.9 based on the documentation. We have restarted all the cloudstack-agents and the cloudstack-management service as well.
>>>
>>> Network traces are showing packets from KVM <-> Mgmt on port 8250. There is no security device in between.
>>>
>>> thanks
>>> fanfarlo
>>>
>>> On 08/24/2016 04:13 PM, Will Stevens wrote:
>>>
>>> @rohit, I am guessing they should be installing the cloudstack-agent using the following repo, right?
>>> That is what is described in the upgrade (trusty instead of precise though).
>>>
>>> http://cloudstack.apt-get.eu/ubuntu/dists/trusty/4.9/
>>>
>>> @fanfarlo, are your repos set up to point to the new 4.9 version?
>>>
>>> cheers,
>>>
>>> will
>>>
>>> On Wed, Aug 24, 2016 at 9:46 AM, Rohit Yadav wrote:
>>>
>>> The PR and fix already exist in the 4.9.0 release. Please make sure to upgrade all of your management server(s) and KVM agents, and then restart them after the upgrade.
>>>
>>> If you are seeing SSL handshake failures, it could be due to a network or security issue, and most likely due to a mismatch between the CloudStack mgmt server and KVM agent versions.
>>>
>>> Regards.
>>>
>>> ------------------------------
>>> *From:* Will Stevens
>>> *Sent:* 24 August 2016 18:17:17
>>> *To:* dev@cloudstack.apache.org; Rohit Yadav
>>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601
>>>
>>> That PR is already merged, so you don't have to do anything to get that code, you already have it.
>>>
>>> @rohit, can you review this? I think this is similar to the issue Simon reported earlier.
>>>
>>> Will
>>>
>>> On Aug 24, 2016 6:56 AM, "fanfarlo" wrote:
>>>
>>> hi all
>>>
>>> We have the following environment:
>>> - OS: Ubuntu 14.04 (hypervisors and management)
>>> - 4 KVM Hosts
>>> - Cloudstack Release 4.9 with local database
>>>
>>> Since we upgraded to Release 4.9 the KVM hosts no longer connect to the management server.
>>> Upgrade procedure was followed as described:
>>> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.9.0/upgrade/upgrade-4.8.html
>>>
>>> On the KVM hosts we have the following error message:
>>> 2016-08-24 10:42:49,678 INFO [utils.exception.CSExceptionErrorCode] (main:null) (logid:) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
>>> 2016-08-24 10:42:49,678 WARN [cloud.agent.Agent] (main:null) (logid:) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:49,678 INFO [cloud.agent.Agent] (main:null) (logid:) Attempted to connect to the server, but received an unexpected exception, trying again...
>>> 2016-08-24 10:42:54,679 INFO [utils.nio.NioClient] (main:null) (logid:) Connecting to 10.100.12.10:8250
>>> 2016-08-24 10:42:54,684 WARN [utils.nio.Link] (main:null) (logid:) This SSL engine was forced to close inbound due to end of stream.
>>> 2016-08-24 10:42:54,684 ERROR [utils.nio.NioClient] (main:null) (logid:) SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:54,685 ERROR [utils.nio.NioConnection] (main:null) (logid:) Unable to initialize the threads.
>>> java.io.IOException: SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>>     at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
>>>     at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
>>>     at com.cloud.agent.Agent.start(Agent.java:237)
>>>     at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
>>>     at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
>>>     at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
>>>     at com.cloud.agent.AgentShell.start(AgentShell.java:456)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>>     at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
>>> 2016-08-24 10:42:54,685 INFO [utils.exception.CSExceptionErrorCode] (main:null) (logid:) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
>>> 2016-08-24 10:42:54,685 WARN [cloud.agent.Agent] (main:null) (logid:) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250
>>> 2016-08-24 10:42:54,686 INFO [cloud.agent.Agent] (main:null) (logid:) Attempted to connect to the server, but received an unexpected exception, trying again...
>>>
>>> The port is open on the management server and there is no firewall in between.
>>> We found that there was a bug report here: https://issues.apache.org/jira/browse/CLOUDSTACK-9348.
>>> There is a PR changing the NIO Selector wait time, https://github.com/apache/cloudstack/pull/1601, which was merged into the master branch.
>>>
>>> Since we installed Release 4.9 we probably need to patch the NioConnection.class as described in PR 1601, right?
>>>
>>> kvm03# unzip -v /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar | grep NioConnection
>>>     3923  Defl:N     1778  55% 2016-08-02 09:28 05aaf7d5  com/cloud/utils/nio/NioConnection$1.class
>>>      881  Defl:N      495  44% 2016-08-02 09:28 e378984c  com/cloud/utils/nio/NioConnection$ChangeRequest.class
>>>    15410  Defl:N     7130  54% 2016-08-02 09:28 b3281f5a  com/cloud/utils/nio/NioConnection.class
>>>     1134  Defl:N      584  49% 2016-08-02 09:28 8d5cb4a8  com/cloud/utils/exception/NioConnectionException.class
>>>
>>> Due to a lack of Java expertise we have some basic questions:
>>> - Is there a patched jar file available? A public build server?
>>> - Do we need to create the jar from sources? What is the procedure?
>>> - How do we apply the patch?
>>>
>>> many thanks!
>>> fanfarlo
>>>
--------------050908080306030309080506--