cloudstack-users mailing list archives

From ilya <ilya.mailing.li...@gmail.com>
Subject Re: Mess after volume migration.
Date Tue, 09 Aug 2016 00:37:30 GMT
This happened to us on a non-Xen hypervisor as well.

CloudStack has a timeout for long-running jobs, which I assume was
exceeded in your case.

Updating the volumes table to reference the proper pool_id should be
enough. Just make sure the data size matches on both ends.
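
As a minimal sketch of the kind of update I mean, assuming your volume
row id is 1004 (as in your dump), the new path is the vdi-uuid that xe
now reports, and <NEW_POOL_ID> is whatever id primary storage B has in
your storage_pool table (placeholders, verify against your own DB and
take a backup of the cloud database first):

  -- find the id of the destination primary storage (the LUN14 pool)
  SELECT id, name, uuid FROM cloud.storage_pool;

  -- point the volume at the new pool and at the VDI path reported by xe
  UPDATE cloud.volumes
     SET pool_id = <NEW_POOL_ID>,
         path    = 'cc1f8e83-f224-44b7-9359-282a1c1e3db1'
   WHERE id = 1004;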

To avoid hitting the job timeout again, consider increasing
"copy.volume.wait" and, if that does not help, "vm.job.timeout".


Regards
ilya

On 8/8/16 3:54 AM, Makrand wrote:
> Guys,
> 
> My setup: ACS 4.4.2. Hypervisor: XenServer 6.2.
> 
> I tried moving a volume of a running VM from primary storage A to primary
> storage B (using the CloudStack GUI). Please note, primary storage A's LUN
> (LUN7) comes from one storage box and primary storage B's LUN (LUN14) comes
> from another.
> 
> For VM1, with a 250 GB data volume (51 GB used space), I was able to move
> the volume without any glitch in about 26 minutes.
> 
> But for VM2, with a 250 GB data volume (182 GB used space), the migration
> ran for about 110 minutes and then failed at the very end with the
> following exception:
> 
> 2016-08-06 14:30:57,481 WARN  [c.c.h.x.r.CitrixResourceBase]
> (DirectAgent-192:ctx-5716ad6d) Task failed! Task record:
> uuid: 308a8326-2622-e4c5-2019-3beb87b0d183
>            nameLabel: Async.VDI.pool_migrate
>      nameDescription:
>    allowedOperations: []
>    currentOperations: {}
>              created: Sat Aug 06 12:36:27 UTC 2016
>             finished: Sat Aug 06 14:30:32 UTC 2016
>               status: failure
>           residentOn: com.xensource.xenapi.Host@f242d3ca
>             progress: 1.0
>                 type: <none/>
>               result:
>            errorInfo: [SR_BACKEND_FAILURE_80, , Failed to mark VDI hidden
> [opterr=SR 96e879bf-93aa-47ca-e2d5-e595afbab294: error aborting existing
> process]]
>          otherConfig: {}
>            subtaskOf: com.xensource.xenapi.Task@aaf13f6f
>             subtasks: []
> 
> 
> So CloudStack just removed the job and reported it as failed, according to
> the management server log.
> 
> A) But when I check at the hypervisor level, the volume is on the new SR,
> i.e. on LUN14. Strange, huh? The new UUID for this volume from the xe CLI
> now looks like:
> 
> [root@gcx-bom-compute1 ~]# xe vbd-list
> vm-uuid=3fcb3070-e373-3cf9-d0aa-0a657142a38d
> uuid ( RO)             : f15dc54a-3868-8de8-5427-314e341879c6
>           vm-uuid ( RO): 3fcb3070-e373-3cf9-d0aa-0a657142a38d
>     vm-name-label ( RO): i-22-803-VM
>          vdi-uuid ( RO): cc1f8e83-f224-44b7-9359-282a1c1e3db1
>             empty ( RO): false
>            device ( RO): hdb
> 
> B) But luckily I had captured this entry before the migration, and it showed:
> 
> uuid ( RO) : f15dc54a-3868-8de8-5427-314e341879c6
> vm-uuid ( RO): 3fcb3070-e373-3cf9-d0aa-0a657142a38d
> vm-name-label ( RO): i-22-803-VM
> vdi-uuid ( RO): 7c073522-a077-41a0-b9a7-7b61847d413b
> empty ( RO): false
> device ( RO): hdb
> 
> C) Since this failed at the CloudStack level, the DB is still holding the
> old values. Here is the current volumes table entry in the DB:
> 
>                         id: 1004
>                 account_id: 22
>                  domain_id: 15
>                    pool_id: 18
>               last_pool_id: NULL
>                instance_id: 803
>                  device_id: 1
>                       name: cloudx_globalcloudxchange_com_W2797T2808S3112_V1462960751
>                       uuid: a8f01042-d0de-4496-98fa-a0b13648bef7
>                       size: 268435456000
>                     folder: NULL
>                       path: 7c073522-a077-41a0-b9a7-7b61847d413b
>                     pod_id: NULL
>             data_center_id: 2
>                 iscsi_name: NULL
>                    host_ip: NULL
>                volume_type: DATADISK
>                  pool_type: NULL
>           disk_offering_id: 6
>                template_id: NULL
> first_snapshot_backup_uuid: NULL
>                recreatable: 0
>                    created: 2016-05-11 09:59:12
>                   attached: 2016-05-11 09:59:21
>                    updated: 2016-08-06 14:30:57
>                    removed: NULL
>                      state: Ready
>                 chain_info: NULL
>               update_count: 42
>                  disk_type: NULL
>     vm_snapshot_chain_size: NULL
>                     iso_id: NULL
>             display_volume: 1
>                     format: VHD
>                   min_iops: NULL
>                   max_iops: NULL
>              hv_ss_reserve: 0
> 1 row in set (0.00 sec)
> 
> 
> So the path column still shows the value 7c073522-a077-41a0-b9a7-7b61847d413b
> and the pool_id is 18.
> 
> The VM is running as of now, but I am sure that the moment I reboot, this
> volume will be gone or, worse, the VM won't boot. This is a production VM, BTW.
> 
> D) So I think I need to edit the volumes table, put the new values in the
> path and pool_id columns, and then reboot the VM. Do I need to make any
> other changes in the DB, in some other tables, for this? Any comment/help
> is much appreciated.
> 
> 
> 
> 
> --
> Best,
> Makrand
> 
