cloudstack-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Likitha Shetty (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CLOUDSTACK-5504) Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after NFS was brought up.
Date Fri, 03 Jan 2014 07:41:51 GMT

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Likitha Shetty resolved CLOUDSTACK-5504.
----------------------------------------

    Resolution: Won't Fix

The behavior is an expected behavior caused by our time-out design.
By design during the snapshot backup process since we lose connection to SSVM, there is no
way to cancel the exportVm task in vCenter from CS.
And if we try to manually delete the snapshot once the primary storage is brought back up,
we are not going to be able to clean it up from the secondary storage because CS is not aware
of the path where the snapshot was backed up (since the copy async call failed before returning
with a timeout error).


> Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because
of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after
NFS was brought up.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5504
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5504
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: Likitha Shetty
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: primarydown.rar
>
>
> Setup:
> Advanced zone set up with 2 5.1 ESXI host.
> 1. Deploy few Vms in each of the hosts  , so we start with 11 Vms.
> 2. Create snapshot for ROOT volumes.
> 3. When snapshot is still in progress , Make the primary storage unavailable for 10 mts.
> 4. Bring up the primary store after more than 10 mts.
> When the primary store was brought up , I see the snapshots that were in progress actually
continue to download to secondary and succeed . 
> one of the snapshots that succeeded and fully available in secondary store:
> root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# ls -ltr
> total 446808
> -rw-r--r--. 1 root root      6454 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b.ovf
> -rw-r--r--. 1 root root 457069056 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b-disk0.vmdk
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]#
> But all the 11 snapshot tasks from Cloud Stack side report failure after about 20 minutes
and then snapshots are put in "CreatedOnPrimary" state.
> Next scheduled hourly snapshot is attempted and succeeds.
> |        22 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        21 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        20 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        19 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        18 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        17 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        16 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        14 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        25 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        24 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        23 | CreatedOnPrimary | 2013-12-13 21:52:18 | NULL                |
> |        22 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        21 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        20 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        19 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        18 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        17 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        16 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        14 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        25 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        24 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        23 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> +-----------+------------------+---------------------+---------------------+
> 86 rows in set (0.00 sec)
> 2013-12-13 16:52:17,720 DEBUG [c.c.a.t.Request] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5)
Seq 5-422576170: Sending  { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags:
100011, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
}
> 2013-12-13 17:13:44,250 DEBUG [c.c.a.m.AgentAttache] (AgentConnectTaskPool-3:ctx-1a3a5155)
Seq 5-422576
> 170: Sending disconnect to class com.cloud.agent.manager.SynchronousListener
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5)
Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5)
Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 INFO  [c.c.u.e.CSExceptionErrorCode] (Job-Executor-5:ctx-1d3bd6cc
ctx-837a59a5) Could not find exception: com.cloud.exception.OperationTimedoutException in
error code list for exceptions
> 2013-12-13 17:13:44,256 WARN  [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5)
Seq 5-422576170: Timed out on Seq 5-422576170:  { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM),
Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
}
> 2013-12-13 17:13:44,294 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5)
Seq 5-422576170: Cancelling.
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.RemoteHostEndPoint] (Job-Executor-5:ctx-1d3bd6cc
ctx-837a59a5) Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException:
Commands 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc
ctx-837a59a5) copy snasphot failed: com.cloud.utils.exception.CloudRuntimeException: Failed
to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands
422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc
ctx-837a59a5) copy failed
> 2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-5:ctx-1d3bd6cc
ctx-837a59a5) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException:
com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException:
Commands 422576170 to Host 5 timed out after 43200
>         at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275)
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135)
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294)
>         at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy161.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>         at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1486)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message