Return-Path: X-Original-To: apmail-cloudstack-issues-archive@www.apache.org Delivered-To: apmail-cloudstack-issues-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A4D6E10922 for ; Fri, 3 Jan 2014 07:41:58 +0000 (UTC) Received: (qmail 74880 invoked by uid 500); 3 Jan 2014 07:41:56 -0000 Delivered-To: apmail-cloudstack-issues-archive@cloudstack.apache.org Received: (qmail 74861 invoked by uid 500); 3 Jan 2014 07:41:54 -0000 Mailing-List: contact issues-help@cloudstack.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cloudstack.apache.org Delivered-To: mailing list issues@cloudstack.apache.org Received: (qmail 74847 invoked by uid 500); 3 Jan 2014 07:41:53 -0000 Delivered-To: apmail-incubator-cloudstack-issues@incubator.apache.org Received: (qmail 74839 invoked by uid 99); 3 Jan 2014 07:41:52 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 03 Jan 2014 07:41:52 +0000 Date: Fri, 3 Jan 2014 07:41:51 +0000 (UTC) From: "Likitha Shetty (JIRA)" To: cloudstack-issues@incubator.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Resolved] (CLOUDSTACK-5504) Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after NFS was brought up. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CLOUDSTACK-5504?page=3Dcom.atl= assian.jira.plugin.system.issuetabpanels:all-tabpanel ] Likitha Shetty resolved CLOUDSTACK-5504. ---------------------------------------- Resolution: Won't Fix The behavior is an expected behavior caused by our time-out design. By design during the snapshot backup process since we lose connection to SS= VM, there is no way to cancel the exportVm task in vCenter from CS. And if we try to manually delete the snapshot once the primary storage is b= rought back up, we are not going to be able to clean it up from the seconda= ry storage because CS is not aware of the path where the snapshot was backe= d up (since the copy async call failed before returning with a timeout erro= r). > Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported= failure because of timing out after 20 minutes.But the snapshot process co= ntinues to succeed in Vmcenter after NFS was brought up. > -------------------------------------------------------------------------= ---------------------------------------------------------------------------= -------------------------------------------------------- > > Key: CLOUDSTACK-5504 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-550= 4 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the defa= ult.)=20 > Components: Management Server > Affects Versions: 4.3.0 > Environment: Build from 4.3 > Reporter: Sangeetha Hariharan > Assignee: Likitha Shetty > Priority: Critical > Fix For: 4.3.0 > > Attachments: primarydown.rar > > > Setup: > Advanced zone set up with 2 5.1 ESXI host. > 1. Deploy few Vms in each of the hosts , so we start with 11 Vms. > 2. Create snapshot for ROOT volumes. > 3. When snapshot is still in progress , Make the primary storage unavaila= ble for 10 mts. > 4. Bring up the primary store after more than 10 mts. > When the primary store was brought up , I see the snapshots that were in = progress actually continue to download to secondary and succeed .=20 > one of the snapshots that succeeded and fully available in secondary stor= e: > root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# ls -ltr > total 446808 > -rw-r--r--. 1 root root 6454 Dec 13 21:09 1c545037-1d1c-4927-918a-2f= 3975e1076b.ovf > -rw-r--r--. 1 root root 457069056 Dec 13 21:09 1c545037-1d1c-4927-918a-2f= 3975e1076b-disk0.vmdk > [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# > But all the 11 snapshot tasks from Cloud Stack side report failure after = about 20 minutes and then snapshots are put in "CreatedOnPrimary" state. > Next scheduled hourly snapshot is attempted and succeeds. > | 22 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL = | > | 21 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL = | > | 20 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL = | > | 19 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL = | > | 18 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL = | > | 17 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL = | > | 16 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL = | > | 14 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL = | > | 25 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL = | > | 24 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL = | > | 23 | CreatedOnPrimary | 2013-12-13 21:52:18 | NULL = | > | 22 | BackedUp | 2013-12-13 22:42:15 | NULL = | > | 21 | BackedUp | 2013-12-13 22:42:15 | NULL = | > | 20 | BackedUp | 2013-12-13 22:42:15 | NULL = | > | 19 | BackedUp | 2013-12-13 22:42:15 | NULL = | > | 18 | BackedUp | 2013-12-13 22:42:15 | NULL = | > | 17 | BackedUp | 2013-12-13 22:42:16 | NULL = | > | 16 | BackedUp | 2013-12-13 22:42:16 | NULL = | > | 14 | BackedUp | 2013-12-13 22:42:16 | NULL = | > | 25 | BackedUp | 2013-12-13 22:42:17 | NULL = | > | 24 | BackedUp | 2013-12-13 22:42:17 | NULL = | > | 23 | BackedUp | 2013-12-13 22:42:17 | NULL = | > +-----------+------------------+---------------------+-------------------= --+ > 86 rows in set (0.00 sec) > 2013-12-13 16:52:17,720 DEBUG [c.c.a.t.Request] (Job-Executor-5:ctx-1d3bd= 6cc ctx-837a59a5) Seq 5-422576170: Sending { Cmd , MgmtId: 95307354844397,= via: 5(s-10-VM), Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.= command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotOb= jectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15= 189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.a= pache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343= -bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57= .195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesy= stem://10.223.57.195//export/home/vmware/primary/?ROLE=3DPrimary&STOREUUID= =3Da0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":21474836= 48,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chain= Info":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343= bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"devic= eId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-= 8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDa= taStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType"= :"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/pri= mary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmw= are/primary/?ROLE=3DPrimary&STOREUUID=3Da0c555cc-695c-3343-bfa0-3413a91dbfe= d"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213= 215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"or= g.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","= volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT",= "dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":= "a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem= ","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"= url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE= =3DPrimary&STOREUUID=3Da0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT= -18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM",= "accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":= [\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":= "OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":= "snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a= 2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://= 10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4= -18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hyperviso= rType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait= ":21600}}] } > 2013-12-13 17:13:44,250 DEBUG [c.c.a.m.AgentAttache] (AgentConnectTaskPoo= l-3:ctx-1a3a5155) Seq 5-422576 > 170: Sending disconnect to class com.cloud.agent.manager.SynchronousListe= ner > 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-= 1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this= is the current command > 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-= 1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this= is the current command > 2013-12-13 17:13:44,252 INFO [c.c.u.e.CSExceptionErrorCode] (Job-Executo= r-5:ctx-1d3bd6cc ctx-837a59a5) Could not find exception: com.cloud.exceptio= n.OperationTimedoutException in error code list for exceptions > 2013-12-13 17:13:44,256 WARN [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-= 1d3bd6cc ctx-837a59a5) Seq 5-422576170: Timed out on Seq 5-422576170: { Cm= d , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags: 100011, [{"org= .apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloud= stack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-0317891= 4845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType"= :"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":= {"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFi= lesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port= ":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary= /?ROLE=3DPrimary&STOREUUID=3Da0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name"= :"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-1= 8-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskCh= ain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","fo= rmat":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshot= Path":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloud= stack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a= 91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path= ":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.2= 23.57.195//export/home/vmware/primary/?ROLE=3DPrimary&STOREUUID=3Da0c555cc-= 695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-hos= t-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiesc= evm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":= {"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e= 7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.Pr= imaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"po= olType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmw= are/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/h= ome/vmware/primary/?ROLE=3DPrimary&STOREUUID=3Da0c555cc-695c-3343-bfa0-3413= a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":= 18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\= "ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROO= T-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware= "},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220= b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.t= o.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role"= :"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20= 131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"execut= eInSequence":false,"wait":21600}}] } > 2013-12-13 17:13:44,294 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-= 1d3bd6cc ctx-837a59a5) Seq 5-422576170: Cancelling. > 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.RemoteHostEndPoint] (Job-Executor-= 5:ctx-1d3bd6cc ctx-837a59a5) Failed to send command, due to Agent:2, com.cl= oud.exception.OperationTimedoutException: Commands 422576170 to Host 5 time= d out after 43200 > 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-= Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy snasphot failed: com.cloud.utils= .exception.CloudRuntimeException: Failed to send command, due to Agent:2, c= om.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5= timed out after 43200 > 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-= Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy failed > 2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor= -5:ctx-1d3bd6cc ctx-837a59a5) Failed to create snapshot > com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exceptio= n.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: F= ailed to send command, due to Agent:2, com.cloud.exception.OperationTimedou= tException: Commands 422576170 to Host 5 timed out after 43200 > at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.bac= kupSnapshot(SnapshotServiceImpl.java:275) > at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrate= gy.backupSnapshot(XenserverSnapshotStrategy.java:135) > at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrate= gy.takeSnapshot(XenserverSnapshotStrategy.java:294) > at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(Sn= apshotManagerImpl.java:951) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccess= orImpl.java:57) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMeth= odAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at org.springframework.aop.support.AopUtils.invokeJoinpointUsingR= eflection(AopUtils.java:317) > at org.springframework.aop.framework.ReflectiveMethodInvocation.i= nvokeJoinpoint(ReflectiveMethodInvocation.java:183) > at org.springframework.aop.framework.ReflectiveMethodInvocation.p= roceed(ReflectiveMethodInvocation.java:150) > at org.springframework.aop.interceptor.ExposeInvocationIntercepto= r.invoke(ExposeInvocationInterceptor.java:91) > at org.springframework.aop.framework.ReflectiveMethodInvocation.p= roceed(ReflectiveMethodInvocation.java:172) > at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(Jd= kDynamicAopProxy.java:204) > at $Proxy161.takeSnapshot(Unknown Source) > at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSna= pshot(VolumeServiceImpl.java:1341) > at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiS= erviceImpl.java:1486) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccess= orImpl.java:57) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMeth= odAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at org.springframework.aop.support.AopUtils.invokeJoinpointUsingR= eflection(AopUtils.java:317) > at org.springframework.aop.framework.ReflectiveMethodInvocation.i= nvokeJoinpoint(ReflectiveMethodInvocation.java:183) > at org.springframework.aop.framework.ReflectiveMethodInvocation.p= roceed(ReflectiveMethodInvocation.java:150) > at org.springframework.aop.interceptor.ExposeInvocationIntercepto= r.invoke(ExposeInvocationInterceptor.java:91) -- This message was sent by Atlassian JIRA (v6.1.5#6160)