cloudstack-dev mailing list archives

From "Tutkowski, Mike" <Mike.Tutkow...@netapp.com>
Subject Re: Problem with CLOUDSTACK-10240 (Cannot migrate local volume to shared storage)
Date Mon, 16 Jul 2018 00:11:21 GMT
Hi Rafael,

Thanks for your time on this.

Here is an example where the new code deviates from the old code in a critical fashion (code
right below is new):

    private Map<Volume, StoragePool> getDefaultMappingOfVolumesAndStoragePoolForMigration(VirtualMachineProfile profile, Host targetHost) {
        Map<Volume, StoragePool> volumeToPoolObjectMap = new HashMap<Volume, StoragePool>();
        List<VolumeVO> allVolumes = _volsDao.findUsableVolumesForInstance(profile.getId());
        for (VolumeVO volume : allVolumes) {
            StoragePoolVO currentPool = _storagePoolDao.findById(volume.getPoolId());
            if (ScopeType.HOST.equals(currentPool.getScope())) {
                createVolumeToStoragePoolMappingIfNeeded(profile, targetHost, volumeToPoolObjectMap, volume, currentPool);
            } else {
                volumeToPoolObjectMap.put(volume, currentPool);
            }
        }
        return volumeToPoolObjectMap;
    }

What happens in the new code (above) is that, if the user didn’t pass in a storage pool to migrate
the virtual disk to but the VM is being migrated to a new cluster, the code simply maps the
virtual disk to its current storage pool, which will not be visible to any of the hosts in the
destination compute cluster.
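
Just to make the point concrete, here is a rough sketch (not a proposed patch) of the shape I
would expect that method to take: keep the current pool only if the target host can actually see
it, and otherwise fall back to allocator-based selection, as the old code did. The two helpers
named below are hypothetical placeholders:

    // Rough sketch only, not a proposed patch. isStoragePoolAvailableToTargetHost() and
    // findPoolViaAllocatorsForTargetHost() are hypothetical helpers: the first stands in for a
    // proper scope/cluster visibility check, the second for the allocator-based selection that
    // the old getPoolListForVolumesForMigration code performed.
    private Map<Volume, StoragePool> getDefaultMappingOfVolumesAndStoragePoolForMigration(VirtualMachineProfile profile, Host targetHost) {
        Map<Volume, StoragePool> volumeToPoolObjectMap = new HashMap<>();
        List<VolumeVO> allVolumes = _volsDao.findUsableVolumesForInstance(profile.getId());
        for (VolumeVO volume : allVolumes) {
            StoragePoolVO currentPool = _storagePoolDao.findById(volume.getPoolId());
            if (isStoragePoolAvailableToTargetHost(currentPool, targetHost)) {
                // The target host can see the volume's current pool, so leaving it in place is fine.
                volumeToPoolObjectMap.put(volume, currentPool);
            } else {
                // Local (host-scoped) storage, or shared storage in another cluster: a new pool
                // in the destination cluster has to be chosen.
                volumeToPoolObjectMap.put(volume, findPoolViaAllocatorsForTargetHost(profile, volume, targetHost));
            }
        }
        return volumeToPoolObjectMap;
    }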

In the old code (I’m looking at 4.11.3 here), you could look around line 2337 for the following
code (in the VirtualMachineManagerImpl.getPoolListForVolumesForMigration method):

                    // Find a suitable pool for the volume. Call the storage pool allocator to find the list of pools.

                    final DiskProfile diskProfile = new DiskProfile(volume, diskOffering, profile.getHypervisorType());
                    final DataCenterDeployment plan = new DataCenterDeployment(host.getDataCenterId(), host.getPodId(), host.getClusterId(), host.getId(), null, null);

                    final List<StoragePool> poolList = new ArrayList<>();
                    final ExcludeList avoid = new ExcludeList();

                    for (final StoragePoolAllocator allocator : _storagePoolAllocators) {
                        final List<StoragePool> poolListFromAllocator = allocator.allocateToPool(diskProfile, profile, plan, avoid, StoragePoolAllocator.RETURN_UPTO_ALL);

                        if (poolListFromAllocator != null && !poolListFromAllocator.isEmpty()) {
                            poolList.addAll(poolListFromAllocator);
                        }
                    }

This old code would find an applicable storage pool in the destination cluster (one that can
be seen by the hosts in that compute cluster).
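
The remainder of that old block then had to pick one of those allocator-provided pools for the
volume. Roughly speaking (this is a sketch of the idea, reusing the variable names from the
snippets above, not the verbatim 4.11 code), the selection looks like this:

    // Sketch of the idea, not the verbatim 4.11 code: prefer keeping the volume where it is
    // if the allocators report that its current pool is also usable from the destination
    // host; otherwise take one of the pools they returned for the destination cluster.
    if (poolList.isEmpty()) {
        throw new CloudRuntimeException("No suitable storage pool found in the destination cluster for volume " + volume.getUuid());
    }
    StoragePool targetPool = poolList.get(0);
    for (final StoragePool allocatedPool : poolList) {
        if (allocatedPool.getId() == currentPool.getId()) {
            targetPool = currentPool;
            break;
        }
    }
    volumeToPoolObjectMap.put(volume, targetPool);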

I think the main error in the new logic is the assumption that a VM can only be migrated to
a host in the same compute cluster. For XenServer (and perhaps other hypervisor types),
we support cross-cluster VM migration.

The other issue I noticed is that there is no logic in the new code that checks for managed-storage
use cases. If you look in the VirtualMachineManagerImpl.getPoolListForVolumesForMigration
method in the old code, there is special handling for managed storage. I don’t see this
reproduced in the new logic.
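
To illustrate what that special handling amounts to (again only a sketch of the idea, not the
exact 4.11 code; targetPool here stands for the pool the user explicitly asked for, if any):
a volume on managed storage has to stay on its current pool, and the destination host just needs
to be granted access to it.

    // Sketch of the managed-storage idea, not the verbatim old code.
    if (currentPool.isManaged()) {
        if (targetPool != null && targetPool.getId() != currentPool.getId()) {
            // A managed volume cannot simply be copied to a different pool as part of a
            // VM migration.
            throw new CloudRuntimeException("Volume " + volume.getUuid() + " is on managed storage and can only be 'migrated' to its current storage pool.");
        }
        // Keep the volume on its managed pool; the destination host only needs access to it
        // (e.g. an SR or datastore created against that managed pool).
        volumeToPoolObjectMap.put(volume, currentPool);
        continue;
    }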

I sympathize with your point that all tests passed yet this issue was not uncovered. Unfortunately,
I suspect we have fairly low automated-test coverage of CloudStack. If we ever did reach a high
percentage of automated test coverage, we might be able to spin up new releases more frequently.
As things stand today, however, there are probably many untested use cases in our automated
test suite.

Thanks again!
Mike

On 7/15/18, 4:19 PM, "Rafael Weingärtner" <rafaelweingartner@gmail.com> wrote:

    Mike, are you able to pinpoint, in the old/replaced code, the bit that was
    handling your use case? I took the utmost care not to break anything.
    Also, isn't your test case in the ACS integration test suite? In
    theory, all tests passed when we merged the PR.
    
    I sure can take a look at it. Can you detail your use case? I mean, the
    high-level execution flow: what API methods you call, what you expected to
    happen, and what is happening today.
    
    On Sun, Jul 15, 2018 at 3:25 AM, Tutkowski, Mike <Mike.Tutkowski@netapp.com>
    wrote:
    
    > It looks like this is the problematic PR:
    >
    > https://github.com/apache/cloudstack/pull/2425/
    >
    > On 7/15/18, 12:20 AM, "Tutkowski, Mike" <Mike.Tutkowski@netapp.com> wrote:
    >
    >     Hi,
    >
    >     While running managed-storage regression tests tonight, I noticed a
    > problem that is not related to managed storage.
    >
    >     CLOUDSTACK-10240 is a ticket asking that we allow the migration of a
    > virtual disk that’s on local storage to shared storage. In the process of
    > enabling this feature, the VirtualMachineManagerImpl.getPoolListForVolumesForMigration
    > method was re-written in a way that
    > completely breaks at least one use case: Migrating a VM across compute
    > clusters (at least supported in XenServer). If, say, a virtual disk resides
    > on shared storage in the source compute cluster, we must be able to copy
    > this virtual disk to shared storage in the destination compute cluster.
    >
    >     As the code is currently written, this is no longer possible. It also
    > seems that the managed-storage logic has been dropped for some reason in
    > the new implementation.
    >
    >     Rafael – It seems that you worked on this feature. Would you be able
    > to look into this and create a PR?
    >
    >     Thanks,
    >     Mike
    >
    >
    >
    
    
    -- 
    Rafael Weingärtner
    
