cloudstack-dev mailing list archives

From Mike Tutkowski <mike.tutkow...@solidfire.com>
Subject Re: Your thoughts on using Primary Storage for keeping snapshots
Date Mon, 16 Feb 2015 21:17:26 GMT
Well...count me in on the general-purpose part (I'm already working on that
and have much of it working).

If someone is interested in implementing the RBD part, he/she can sync with
me and see if there is any overlap with work that I've already implemented
from a general-purpose standpoint.

On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <irae@cloudops.com> wrote:

> Agree with Logan. As fans of Ceph as well as SolidFire, we are interested
> in seeing this particular use case (RBD/KVM) being well implemented,
> however the concept of volume snapshots residing only on primary storage vs
> being transferred to secondary storage is a more generally useful one that
> is worth solving with the same terminology and interfaces, even if the
> mechanisms may be specific to the storage type and hypervisor.
>
> If it's not practical then it's not practical, but it seems like it would be
> worth trying.
>
> On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield <lbarfield@tqhosting.com>
> wrote:
>
> > Hi Mike,
> >
> > I agree it is a general CloudStack issue that can be addressed across
> > multiple primary storage options.  It's a two stage issue since some
> > changes will need to be implemented to support these features across
> > the board, and others will need to be made to each storage option.
> >
> > It would be nice to see a single issue opened to cover this across all
> > available storage options.  Maybe have a community vote on what
> > support they want to see, and not consider the feature complete until
> > all of the desired options are implemented?  That would slow down
> > development for sure, but it would ensure that it was supported where
> > it needs to be.
> >
> > Thank You,
> >
> > Logan Barfield
> > Tranquil Hosting
> >
> >
> > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > <mike.tutkowski@solidfire.com> wrote:
> > > For example, Punith from CloudByte sent out an e-mail yesterday that was
> > > very similar to this thread, but he was wondering how to implement such a
> > > concept on his company's SAN technology.
> > >
> > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > mike.tutkowski@solidfire.com> wrote:
> > >
> > >> Yeah, I think it's a similar concept, though.
> > >>
> > >> You would want to take snapshots on Ceph (or some other backend system
> > >> that acts as primary storage) instead of copying data to secondary
> > >> storage and calling it a snapshot.
> > >>
> > >> For Ceph or any other backend system like that, the idea is to speed up
> > >> snapshots by not requiring CPU cycles on the front end or network
> > >> bandwidth to transfer the data.
> > >>
> > >> In that sense, this is a general-purpose CloudStack problem and it
> > >> appears you are intending on discussing only the Ceph implementation
> > >> here, which is fine.
> > >>
> > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfield@tqhosting.com>
> > >> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> I think the interest in this issue is primarily for Ceph RBD, which
> > >>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
> > >>> is only currently supported in KVM (and VMware?).  QEMU has native RBD
> > >>> support, so it attaches the devices directly to the VMs in question.
> > >>> It also natively supports snapshotting, which is what this discussion
> > >>> is about.
> > >>>
> > >>> Thank You,
> > >>>
> > >>> Logan Barfield
> > >>> Tranquil Hosting
> > >>>
> > >>>
> > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > >>> <mike.tutkowski@solidfire.com> wrote:
> > >>> > I should have also commented on KVM (since that was the hypervisor
> > >>> > called out in the initial e-mail).
> > >>> >
> > >>> > In my situation, most of my customers use XenServer and/or ESXi, so
> > >>> > KVM has received the fewest of my cycles with regards to those three
> > >>> > hypervisors.
> > >>> >
> > >>> > KVM, though, is actually the simplest hypervisor for which to
> > >>> > implement these changes (since I am using the iSCSI adapter of the KVM
> > >>> > agent and it just essentially passes my LUN to the VM in question).
> > >>> >
> > >>> > For KVM, there is no clustered file system applied to my backend LUN,
> > >>> > so I don't have to "worry" about that layer.
> > >>> >
> > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such
> > >>> > is the case with XenServer) or having to re-signature anything (such
> > >>> > is the case with ESXi).
> > >>> >
> > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > >>> > mike.tutkowski@solidfire.com> wrote:
> > >>> >
> > >>> >> I have been working on this on and off for a while now (as time
> > >>> >> permits).
> > >>> >>
> > >>> >> Here is an e-mail I sent to a customer of ours that helps describe
> > >>> >> some of the issues:
> > >>> >>
> > >>> >> *** Beginning of e-mail ***
> > >>> >>
> > >>> >> The main requests were around the following features:
> > >>> >>
> > >>> >> * The ability to leverage SolidFire snapshots.
> > >>> >>
> > >>> >> * The ability to create CloudStack templates from SolidFire snapshots.
> > >>> >>
> > >>> >> I had these on my roadmap, but bumped the priority up and began work
> > >>> >> on them for the CS 4.6 release.
> > >>> >>
> > >>> >> During design, I realized there were issues with the way XenServer is
> > >>> >> architected that prevented me from directly using SolidFire snapshots.
> > >>> >>
> > >>> >> I could definitely take a SolidFire snapshot of a SolidFire volume,
> > >>> >> but this snapshot would not be usable from XenServer if the original
> > >>> >> volume was still in use.
> > >>> >>
> > >>> >> Here is the gist of the problem:
> > >>> >>
> > >>> >> When XenServer leverages an iSCSI target such as a SolidFire volume,
> > >>> >> it applies a clustered file system to it, which they call a storage
> > >>> >> repository (SR). An SR has an *immutable* UUID associated with it.
> > >>> >>
> > >>> >> The virtual volume (which a VM sees as a disk) is represented by a
> > >>> >> virtual disk image (VDI) in the SR. A VDI also has an *immutable* UUID
> > >>> >> associated with it.
> > >>> >>
> > >>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
> > >>> >> later try to use that snapshot from XenServer, XenServer complains
> > >>> >> that the SR on the snapshot has a UUID that conflicts with an existing
> > >>> >> UUID.
> > >>> >>
> > >>> >> In other words, it is not possible to use the original SR and the
> > >>> >> snapshot of this SR from XenServer at the same time, which is critical
> > >>> >> in a cloud environment (to enable creating templates from snapshots).
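
The UUID conflict described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Host` class, `backend_snapshot` function, and dictionary layout are hypothetical names made up for illustration, not XenServer or SolidFire APIs. The one idea it takes from the e-mail is that a backend snapshot carries the same SR UUID as the original volume, so attaching both at once fails.

```python
import uuid


class Host:
    """Toy stand-in for a hypervisor that tracks attached SRs by UUID."""

    def __init__(self):
        self.attached = {}  # SR UUID -> name of the backend volume holding it

    def attach(self, volume):
        sr_uuid = volume["sr_uuid"]
        if sr_uuid in self.attached:
            raise ValueError(
                f"SR UUID {sr_uuid} conflicts with {self.attached[sr_uuid]}"
            )
        self.attached[sr_uuid] = volume["name"]


def backend_snapshot(volume, name):
    # A block-level snapshot copies everything on the volume verbatim,
    # including the SR UUID that the hypervisor wrote into it.
    return {**volume, "name": name}


original = {"name": "vol-1", "sr_uuid": str(uuid.uuid4())}
snapshot = backend_snapshot(original, "vol-1-snap")

host = Host()
host.attach(original)
try:
    host.attach(snapshot)  # same SR UUID as the original, so this is rejected
    conflict = False
except ValueError:
    conflict = True
```

In this model the second `attach` is rejected while the original is still attached, which mirrors the situation in which a template cannot be created from the snapshot while the source volume is in use.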
> > >>> >>
> > >>> >> The way I have proposed circumventing this issue is not ideal, but
> > >>> >> technically works (this code is checked into the CS 4.6 branch):
> > >>> >>
> > >>> >> When the time comes to take a CloudStack snapshot of a CloudStack
> > >>> >> volume that is backed by SolidFire storage via the storage plug-in,
> > >>> >> the plug-in will create a new SolidFire volume with characteristics
> > >>> >> (size and IOPS) equal to those of the original volume.
> > >>> >>
> > >>> >> We then have XenServer attach to this new SolidFire volume, create a
> > >>> >> *new* SR on it, and then copy the VDI from the source SR to the
> > >>> >> destination SR (the new SR).
> > >>> >>
> > >>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but
> > >>> >> it requires CPU cycles on the compute cluster as well as network
> > >>> >> bandwidth to write to the SAN (thus it is slower and more resource
> > >>> >> intensive than a SolidFire snapshot).
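
The three steps of that workaround (new volume with matching size and IOPS, new SR, VDI copy) can be sketched as follows. This is a hedged illustration, not the code checked into the CS 4.6 branch: `Volume`, `copy_based_snapshot`, and every field name are hypothetical, and the in-memory byte copy merely stands in for the host-side copy that costs CPU and network bandwidth.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Volume:
    """Hypothetical backend volume; fields are illustrative only."""
    size_gb: int
    iops: int
    sr_uuid: Optional[str] = None
    vdis: Dict[str, bytes] = field(default_factory=dict)  # VDI UUID -> data


def copy_based_snapshot(source: Volume) -> Volume:
    # Step 1: new backend volume with the same size and IOPS settings.
    target = Volume(source.size_gb, source.iops)
    # Step 2: the hypervisor creates a *new* SR on it, so the UUID is fresh.
    target.sr_uuid = str(uuid.uuid4())
    # Step 3: copy each VDI into the new SR under a new VDI UUID. In the
    # real system this is the step that uses host CPU and network bandwidth.
    for data in source.vdis.values():
        target.vdis[str(uuid.uuid4())] = bytes(data)
    return target


source = Volume(size_gb=100, iops=1000, sr_uuid=str(uuid.uuid4()))
source.vdis[str(uuid.uuid4())] = b"disk contents"
snap = copy_based_snapshot(source)
```

Because the copy lives in a freshly created SR, its SR and VDI UUIDs differ from the originals, so both could be attached at the same time. The price is a full data copy rather than a backend-side snapshot.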
> > >>> >>
> > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning
> > >>> >> this issue before and during the CloudStack Collaboration Conference
> > >>> >> in Budapest in November. He agreed that this is a legitimate issue
> > >>> >> with the way XenServer is designed and could not think of a way
> > >>> >> (other than what I was doing) to get around it in current versions of
> > >>> >> XenServer.
> > >>> >>
> > >>> >> One thought is to have a feature added to XenServer that enables you
> > >>> >> to change the UUID of an SR and of a VDI.
> > >>> >>
> > >>> >> If I could do that, then I could take a SolidFire snapshot of the
> > >>> >> SolidFire volume and issue commands to XenServer to have it change
> > >>> >> the UUIDs of the original SR and the original VDI. I could then
> > >>> >> record the necessary UUID info in the CS DB.
> > >>> >>
> > >>> >> *** End of e-mail ***
> > >>> >>
> > >>> >> I have since investigated this on ESXi.
> > >>> >>
> > >>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
> > >>> >> snapshots can be taken and effectively used on this hypervisor.
> > >>> >>
> > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lbarfield@tqhosting.com>
> > >>> >> wrote:
> > >>> >>
> > >>> >>> I'm just going to stick with the qemu-img option change for RBD
> > >>> >>> for now (which should cut snapshot time down drastically), and look
> > >>> >>> forward to this in the future.  I'd be happy to help get this
> > >>> >>> moving, but I'm not enough of a developer to lead the charge.
> > >>> >>>
> > >>> >>> As far as renaming goes, I agree that maybe backups isn't the right
> > >>> >>> word.  That being said calling a full-sized copy of a volume a
> > >>> >>> "snapshot" also isn't the right word.  Maybe "image" would be
> > >>> >>> better?
> > >>> >>>
> > >>> >>> I've also got my reservations about "accounts" vs "users" (I think
> > >>> >>> "departments" and "accounts or users" respectively is less
> > >>> >>> confusing), but that's a different thread.
> > >>> >>>
> > >>> >>> Thank You,
> > >>> >>>
> > >>> >>> Logan Barfield
> > >>> >>> Tranquil Hosting
> > >>> >>>
> > >>> >>>
> > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wido@widodh.nl>
> > >>> >>> wrote:
> > >>> >>> >
> > >>> >>> >
> > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > >>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
> > >>> >>> >> still be support for copying snapshots to secondary storage as
> > >>> >>> >> needed (for transfers between zones, etc.).  I really think that
> > >>> >>> >> this could be part of a larger move to clarify the naming
> > >>> >>> >> conventions used for disk operations.  Currently "Volume
> > >>> >>> >> Snapshots" should probably really be called "Backups".  So having
> > >>> >>> >> "snapshot" functionality, and a "convert snapshot to
> > >>> >>> >> backup/template" would be a good move.
> > >>> >>> >>
> > >>> >>> >
> > >>> >>> > I fully agree that this would be a very great addition.
> > >>> >>> >
> > >>> >>> > I won't be able to work on this any time soon though.
> > >>> >>> >
> > >>> >>> > Wido
> > >>> >>> >
> > >>> >>> >> Thank You,
> > >>> >>> >>
> > >>> >>> >> Logan Barfield
> > >>> >>> >> Tranquil Hosting
> > >>> >>> >>
> > >>> >>> >>
> > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <andrija.panic@gmail.com> wrote:
> > >>> >>> >>> BIG +1
> > >>> >>> >>>
> > >>> >>> >>> My team should submit some patch to ACS for better KVM snapshots,
> > >>> >>> >>> including whole VM snapshot etc...but it's too early to give
> > >>> >>> >>> details...
> > >>> >>> >>> best
> > >>> >>> >>>
> > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <andrei@arhont.com> wrote:
> > >>> >>> >>>
> > >>> >>> >>>> Hello guys,
> > >>> >>> >>>>
> > >>> >>> >>>> I was hoping to have some feedback from the community on the
> > >>> >>> >>>> subject of having an ability to keep snapshots on the primary
> > >>> >>> >>>> storage where it is supported by the storage backend.
> > >>> >>> >>>>
> > >>> >>> >>>> The idea behind this functionality is to improve how snapshots
> > >>> >>> >>>> are currently handled on KVM hypervisors with Ceph primary
> > >>> >>> >>>> storage. At the moment, the snapshots are taken on the primary
> > >>> >>> >>>> storage and then copied to the secondary storage. This method is
> > >>> >>> >>>> very slow and inefficient even on small infrastructure. Even on
> > >>> >>> >>>> medium deployments, using snapshots in KVM becomes nearly
> > >>> >>> >>>> impossible. If you have tens or hundreds of concurrent snapshots
> > >>> >>> >>>> taking place you will have a bunch of timeouts and errors, your
> > >>> >>> >>>> network becomes clogged, etc. In addition, using these snapshots
> > >>> >>> >>>> for creating new volumes or reverting VMs is also slow and
> > >>> >>> >>>> inefficient. As above, when you have tens or hundreds of
> > >>> >>> >>>> concurrent operations they will not succeed and you will have a
> > >>> >>> >>>> majority of tasks with errors or timeouts.
> > >>> >>> >>>>
> > >>> >>> >>>> At the moment, taking a single snapshot of relatively small
> > >>> >>> >>>> volumes (200GB or 500GB for instance) takes tens if not hundreds
> > >>> >>> >>>> of minutes. Taking a snapshot of the same volume on Ceph primary
> > >>> >>> >>>> storage takes a few seconds at most! Similarly, converting a
> > >>> >>> >>>> snapshot to a volume takes tens if not hundreds of minutes when
> > >>> >>> >>>> secondary storage is involved, compared with seconds if done
> > >>> >>> >>>> directly on the primary storage.
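
As a rough sanity check on those figures, here is a back-of-the-envelope calculation of the copy time alone. It is a sketch under loud assumptions that are not in the thread: a 1 Gbit/s link to secondary storage, decimal gigabytes, no protocol overhead, no disk bottleneck, and concurrent copies sharing the link equally. The function name `transfer_minutes` is made up for the example.

```python
def transfer_minutes(size_gb, link_gbit_per_s, concurrent=1):
    """Minutes to copy size_gb gigabytes over a link shared equally
    by `concurrent` transfers (assumed, idealised model)."""
    size_gbit = size_gb * 8                     # gigabytes -> gigabits
    per_transfer_rate = link_gbit_per_s / concurrent
    seconds = size_gbit / per_transfer_rate
    return seconds / 60


single_200 = transfer_minutes(200, 1.0)                 # one 200 GB copy
single_500 = transfer_minutes(500, 1.0)                 # one 500 GB copy
ten_at_once = transfer_minutes(200, 1.0, concurrent=10) # ten 200 GB copies
```

Under these assumptions a single 200 GB copy takes about 27 minutes and a 500 GB copy about 67 minutes, while ten concurrent 200 GB copies take about 267 minutes each. That is consistent with the "tens if not hundreds of minutes" reported above, though the real bottleneck in any given deployment may be something other than the network.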
> > >>> >>> >>>>
> > >>> >>> >>>> I suggest that CloudStack should have the ability to keep volume
> > >>> >>> >>>> snapshots on the primary storage where this is supported by the
> > >>> >>> >>>> storage, perhaps with a per-primary-storage setting that enables
> > >>> >>> >>>> this functionality. This will be beneficial for Ceph primary
> > >>> >>> >>>> storage on KVM hypervisors and perhaps on XenServer when Ceph is
> > >>> >>> >>>> supported there in the near future.
> > >>> >>> >>>>
> > >>> >>> >>>> This will greatly speed up the process of using snapshots on KVM
> > >>> >>> >>>> and users will actually start using snapshotting rather than
> > >>> >>> >>>> giving up with frustration.
> > >>> >>> >>>>
> > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
> > >>> >>> >>>> vote if you are in agreement.
> > >>> >>> >>>>
> > >>> >>> >>>> Thanks for your input
> > >>> >>> >>>>
> > >>> >>> >>>> Andrei
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>
> > >>> >>> >>>
> > >>> >>> >>> --
> > >>> >>> >>>
> > >>> >>> >>> Andrija Panić
> > >>> >>>
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> --
> > >>> >> *Mike Tutkowski*
> > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > >>> >> e: mike.tutkowski@solidfire.com
> > >>> >> o: 303.746.7302
> > >>> >> Advancing the way the world uses the cloud
> > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > >>> >>
> > >>> >
> > >>> >
> > >>> >
> > >>>
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> >
>
>
>
> --
> *Ian Rae*
> PDG *| *CEO
> t *514.944.4008*
>
> *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions Experts
> w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|* Montreal *|*
>  Quebec *|* H3J 1S6
>
> <https://www.cloud.ca/>
> <
> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> >
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*
