cloudstack-dev mailing list archives

From Mike Tutkowski <mike.tutkow...@solidfire.com>
Subject Re: Your thoughts on using Primary Storage for keeping snapshots
Date Mon, 16 Feb 2015 16:46:24 GMT
I should have also commented on KVM (since that was the hypervisor called
out in the initial e-mail).

In my situation, most of my customers use XenServer and/or ESXi, so of
those three hypervisors KVM has received the fewest of my cycles.

KVM, though, is actually the simplest hypervisor for which to implement
these changes (since I am using the iSCSI adapter of the KVM agent and it
just essentially passes my LUN to the VM in question).

For KVM, there is no clustered file system applied to my backend LUN, so I
don't have to "worry" about that layer.

I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (as is the
case with XenServer) or having to re-signature anything (as is the case
with ESXi).
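
To make the contrast concrete, here is a rough toy model of the UUID
conflict described in the quoted e-mail below and of how the three
hypervisors differ. It is only a sketch: every class and function name in it
(Lun, XenHost, backend_snapshot, copy_based_snapshot, snapshot_strategy) is
made up for illustration and does not correspond to the CloudStack,
XenServer, ESXi or SolidFire APIs, and the actual VDI copy is left out.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Hypervisor(Enum):
    XENSERVER = "XenServer"
    ESXI = "ESXi"
    KVM = "KVM"


@dataclass(frozen=True)
class Lun:
    """Hypothetical backend volume; sr_uuid is the SR identity stored on it."""
    name: str
    sr_uuid: Optional[str] = None


def backend_snapshot(lun: Lun, name: str) -> Lun:
    """A block-level snapshot copies everything, including the SR UUID."""
    return Lun(name=name, sr_uuid=lun.sr_uuid)


@dataclass
class XenHost:
    """Hypothetical host that refuses two SRs with the same UUID."""
    attached_sr_uuids: set = field(default_factory=set)

    def attach(self, lun: Lun) -> None:
        if lun.sr_uuid in self.attached_sr_uuids:
            raise ValueError(f"SR UUID {lun.sr_uuid} conflicts with an existing SR")
        self.attached_sr_uuids.add(lun.sr_uuid)


def copy_based_snapshot(host: XenHost, source: Lun, name: str) -> Lun:
    """Workaround: new volume, *new* SR with a fresh UUID, then a VDI copy.

    The copy itself (the slow, resource-intensive part) is not modelled here.
    """
    target = Lun(name=name, sr_uuid=str(uuid.uuid4()))
    host.attach(target)
    return target


def snapshot_strategy(hypervisor: Hypervisor) -> str:
    """Summarise, per hypervisor, what this thread says is needed."""
    if hypervisor is Hypervisor.XENSERVER:
        return "copy the VDI into a new SR on a new volume"
    if hypervisor is Hypervisor.ESXI:
        return "take a backend snapshot, then re-signature the datastore"
    return "take a backend snapshot and pass the LUN to the VM"


if __name__ == "__main__":
    host = XenHost()
    original = Lun("vol-1", sr_uuid=str(uuid.uuid4()))
    host.attach(original)

    snap = backend_snapshot(original, "vol-1-snap")
    try:
        host.attach(snap)  # same SR UUID as the volume still in use
    except ValueError as err:
        print("backend snapshot rejected:", err)

    copy = copy_based_snapshot(host, original, "vol-1-copy")
    print("copy attached with new SR UUID:", copy.sr_uuid != original.sr_uuid)
    for hv in Hypervisor:
        print(hv.value, "->", snapshot_strategy(hv))
```

In this model the backend snapshot is rejected only because the original SR
is still attached, which matches the "original volume was still in use"
condition in the e-mail below.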

On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
mike.tutkowski@solidfire.com> wrote:

> I have been working on this on and off for a while now (as time permits).
>
> Here is an e-mail I sent to a customer of ours that helps describe some of
> the issues:
>
> *** Beginning of e-mail ***
>
> The main requests were around the following features:
>
> * The ability to leverage SolidFire snapshots.
>
> * The ability to create CloudStack templates from SolidFire snapshots.
>
> I had these on my roadmap, but bumped the priority up and began work on
> them for the CS 4.6 release.
>
> During design, I realized there were issues with the way XenServer is
> architected that prevented me from directly using SolidFire snapshots.
>
> I could definitely take a SolidFire snapshot of a SolidFire volume, but
> this snapshot would not be usable from XenServer if the original volume was
> still in use.
>
> Here is the gist of the problem:
>
> When XenServer leverages an iSCSI target such as a SolidFire volume, it
> applies a clustered file system to it, which they call a storage
> repository (SR). An SR has an *immutable* UUID associated with it.
>
> The virtual volume (which a VM sees as a disk) is represented by a virtual
> disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated
> with it.
>
> If I take a snapshot (or a clone) of the SolidFire volume and then later
> try to use that snapshot from XenServer, XenServer complains that the SR on
> the snapshot has a UUID that conflicts with an existing UUID.
>
> In other words, it is not possible to use the original SR and the snapshot
> of this SR from XenServer at the same time, which is critical in a cloud
> environment (to enable creating templates from snapshots).
>
> The way I have proposed circumventing this issue is not ideal, but
> technically works (this code is checked into the CS 4.6 branch):
>
> When the time comes to take a CloudStack snapshot of a CloudStack volume
> that is backed by SolidFire storage via the storage plug-in, the plug-in
> will create a new SolidFire volume with characteristics (size and IOPS)
> equal to those of the original volume.
>
> We then have XenServer attach to this new SolidFire volume, create a *new*
> SR on it, and then copy the VDI from the source SR to the destination SR
> (the new SR).
>
> This leads to us having a copy of the VDI (a "snapshot" of sorts), but it
> requires CPU cycles on the compute cluster as well as network bandwidth to
> write to the SAN (thus it is slower and more resource intensive than a
> SolidFire snapshot).
>
> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this
> issue before and during the CloudStack Collaboration Conference in Budapest
> in November. He agreed that this is a legitimate issue with the way
> XenServer is designed and could not think of a way (other than what I was
> doing) to get around it in current versions of XenServer.
>
> One thought is to have a feature added to XenServer that enables you to
> change the UUID of an SR and of a VDI.
>
> If I could do that, then I could take a SolidFire snapshot of the
> SolidFire volume and issue commands to XenServer to have it change the
> UUIDs of the original SR and the original VDI. I could then record the
> necessary UUID info in the CS DB.
>
> *** End of e-mail ***
>
> I have since investigated this on ESXi.
>
> ESXi does have a way for us to "re-signature" a datastore, so backend
> snapshots can be taken and effectively used on this hypervisor.
>
> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lbarfield@tqhosting.com>
> wrote:
>
>> I'm just going to stick with the qemu-img option change for RBD for
>> now (which should cut snapshot time down drastically), and look
>> forward to this in the future.  I'd be happy to help get this moving,
>> but I'm not enough of a developer to lead the charge.
>>
>> As far as renaming goes, I agree that maybe "backups" isn't the right
>> word.  That being said, calling a full-sized copy of a volume a
>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>>
>> I've also got my reservations about "accounts" vs "users" (I think
>> "departments" and "accounts or users" respectively is less confusing),
>> but that's a different thread.
>>
>> Thank You,
>>
>> Logan Barfield
>> Tranquil Hosting
>>
>>
>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wido@widodh.nl>
>> wrote:
>> >
>> >
>> > On 16-02-15 15:38, Logan Barfield wrote:
>> >> I like this idea a lot for Ceph RBD.  I do think there should still be
>> >> support for copying snapshots to secondary storage as needed (for
>> >> transfers between zones, etc.).  I really think that this could be
>> >> part of a larger move to clarify the naming conventions used for disk
>> >> operations.  Currently "Volume Snapshots" should probably really be
>> >> called "Backups".  So having "snapshot" functionality, and a "convert
>> >> snapshot to backup/template" would be a good move.
>> >>
>> >
>> > I fully agree that this would be a very great addition.
>> >
>> > I won't be able to work on this any time soon though.
>> >
>> > Wido
>> >
>> >> Thank You,
>> >>
>> >> Logan Barfield
>> >> Tranquil Hosting
>> >>
>> >>
>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <andrija.panic@gmail.com> wrote:
>> >>> BIG +1
>> >>>
>> >>> My team should submit some patch to ACS for better KVM snapshots,
>> >>> including whole VM snapshot etc...but it's too early to give
>> >>> details...
>> >>> best
>> >>>
>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <andrei@arhont.com> wrote:
>> >>>
>> >>>> Hello guys,
>> >>>>
>> >>>> I was hoping to have some feedback from the community on the subject
>> >>>> of having an ability to keep snapshots on the primary storage where
>> >>>> it is supported by the storage backend.
>> >>>>
>> >>>> The idea behind this functionality is to improve how snapshots are
>> >>>> currently handled on KVM hypervisors with Ceph primary storage. At
>> >>>> the moment, the snapshots are taken on the primary storage and then
>> >>>> copied to the secondary storage. This method is very slow and
>> >>>> inefficient even on small infrastructure. Even on medium deployments,
>> >>>> using snapshots in KVM becomes nearly impossible. If you have tens or
>> >>>> hundreds of concurrent snapshots taking place, you will have a bunch
>> >>>> of timeouts and errors, your network becomes clogged, etc. In
>> >>>> addition, using these snapshots for creating new volumes or reverting
>> >>>> VMs is also slow and inefficient. As above, when you have tens or
>> >>>> hundreds of concurrent operations, they will not succeed and you will
>> >>>> have a majority of tasks with errors or timeouts.
>> >>>>
>> >>>> At the moment, taking a single snapshot of relatively small volumes
>> >>>> (200GB or 500GB for instance) takes tens if not hundreds of minutes.
>> >>>> Taking a snapshot of the same volume on Ceph primary storage takes a
>> >>>> few seconds at most! Similarly, converting a snapshot to a volume
>> >>>> takes tens if not hundreds of minutes when secondary storage is
>> >>>> involved, compared with seconds if done directly on the primary
>> >>>> storage.
>> >>>>
>> >>>> I suggest that CloudStack should have the ability to keep volume
>> >>>> snapshots on the primary storage where this is supported by the
>> >>>> storage. Perhaps there could be a per-primary-storage setting that
>> >>>> enables this functionality. This will be beneficial for Ceph primary
>> >>>> storage on KVM hypervisors and perhaps on XenServer when Ceph is
>> >>>> supported in the near future.
>> >>>>
>> >>>> This will greatly speed up the process of using snapshots on KVM, and
>> >>>> users will actually start using snapshotting rather than giving up in
>> >>>> frustration.
>> >>>>
>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote
>> >>>> if you are in agreement.
>> >>>>
>> >>>> Thanks for your input
>> >>>>
>> >>>> Andrei
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>>
>> >>> Andrija Panić
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*
