From: Mike Tutkowski <mike.tutkowski@solidfire.com>
Date: Mon, 16 Feb 2015 14:17:26 -0700
Subject: Re: Your thoughts on using Primary Storage for keeping snapshots
To: dev@cloudstack.apache.org

Well...count me in on the general-purpose part (I'm already working on that and have much of it working).

If someone is interested in implementing the RBD part, he/she can sync with me and see if there is any overlapping work that I've already implemented from a general-purpose standpoint.

On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae wrote:

> Agree with Logan.
> As fans of Ceph as well as SolidFire, we are interested in seeing this particular use case (RBD/KVM) being well implemented; however, the concept of volume snapshots residing only on primary storage vs. being transferred to secondary storage is a more generally useful one that is worth solving with the same terminology and interfaces, even if the mechanisms may be specific to the storage type and hypervisor.
>
> If it's not practical then it's not practical, but it seems like it would be worth trying.
>
> On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield wrote:
>
> > Hi Mike,
> >
> > I agree it is a general CloudStack issue that can be addressed across multiple primary storage options. It's a two-stage issue, since some changes will need to be implemented to support these features across the board, and others will need to be made to each storage option.
> >
> > It would be nice to see a single issue opened to cover this across all available storage options. Maybe have a community vote on what support they want to see, and not consider the feature complete until all of the desired options are implemented? That would slow down development for sure, but it would ensure that it was supported where it needs to be.
> >
> > Thank You,
> >
> > Logan Barfield
> > Tranquil Hosting
> >
> > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski wrote:
> > > For example, Punith from CloudByte sent out an e-mail yesterday that was very similar to this thread, but he was wondering how to implement such a concept on his company's SAN technology.
> > >
> > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <mike.tutkowski@solidfire.com> wrote:
> > >
> > >> Yeah, I think it's a similar concept, though.
> > >>
> > >> You would want to take snapshots on Ceph (or some other backend system that acts as primary storage) instead of copying data to secondary storage and calling it a snapshot.
> > >>
> > >> For Ceph or any other backend system like that, the idea is to speed up snapshots by not requiring CPU cycles on the front end or network bandwidth to transfer the data.
> > >>
> > >> In that sense, this is a general-purpose CloudStack problem, and it appears you intend to discuss only the Ceph implementation here, which is fine.
> > >>
> > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfield@tqhosting.com> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> I think the interest in this issue is primarily for Ceph RBD, which doesn't use iSCSI or SAN concepts in general. As well, I believe RBD is only currently supported in KVM (and VMware?). QEMU has native RBD support, so it attaches the devices directly to the VMs in question. It also natively supports snapshotting, which is what this discussion is about.
> > >>>
> > >>> Thank You,
> > >>>
> > >>> Logan Barfield
> > >>> Tranquil Hosting
> > >>>
> > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski wrote:
> > >>> > I should have also commented on KVM (since that was the hypervisor called out in the initial e-mail).
> > >>> >
> > >>> > In my situation, most of my customers use XenServer and/or ESXi, so KVM has received the fewest of my cycles with regards to those three hypervisors.
> > >>> > KVM, though, is actually the simplest hypervisor for which to implement these changes (since I am using the iSCSI adapter of the KVM agent and it just essentially passes my LUN to the VM in question).
> > >>> >
> > >>> > For KVM, there is no clustered file system applied to my backend LUN, so I don't have to "worry" about that layer.
> > >>> >
> > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is the case with XenServer) or having to re-signature anything (such is the case with ESXi).
> > >>> >
> > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <mike.tutkowski@solidfire.com> wrote:
> > >>> >
> > >>> >> I have been working on this on and off for a while now (as time permits).
> > >>> >>
> > >>> >> Here is an e-mail I sent to a customer of ours that helps describe some of the issues:
> > >>> >>
> > >>> >> *** Beginning of e-mail ***
> > >>> >>
> > >>> >> The main requests were around the following features:
> > >>> >>
> > >>> >> * The ability to leverage SolidFire snapshots.
> > >>> >>
> > >>> >> * The ability to create CloudStack templates from SolidFire snapshots.
> > >>> >>
> > >>> >> I had these on my roadmap, but bumped the priority up and began work on them for the CS 4.6 release.
> > >>> >>
> > >>> >> During design, I realized there were issues with the way XenServer is architected that prevented me from directly using SolidFire snapshots.
> > >>> >>
> > >>> >> I could definitely take a SolidFire snapshot of a SolidFire volume, but this snapshot would not be usable from XenServer if the original volume was still in use.
> > >>> >>
> > >>> >> Here is the gist of the problem:
> > >>> >>
> > >>> >> When XenServer leverages an iSCSI target such as a SolidFire volume, it applies a clustered file system to it, which they call a storage repository (SR). An SR has an *immutable* UUID associated with it.
> > >>> >>
> > >>> >> The virtual volume (which a VM sees as a disk) is represented by a virtual disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated with it.
> > >>> >>
> > >>> >> If I take a snapshot (or a clone) of the SolidFire volume and then later try to use that snapshot from XenServer, XenServer complains that the SR on the snapshot has a UUID that conflicts with an existing UUID.
> > >>> >>
> > >>> >> In other words, it is not possible to use the original SR and the snapshot of this SR from XenServer at the same time, which is critical in a cloud environment (to enable creating templates from snapshots).
> > >>> >>
> > >>> >> The way I have proposed circumventing this issue is not ideal, but it technically works (this code is checked into the CS 4.6 branch):
> > >>> >>
> > >>> >> When the time comes to take a CloudStack snapshot of a CloudStack volume that is backed by SolidFire storage via the storage plug-in, the plug-in will create a new SolidFire volume with characteristics (size and IOPS) equal to those of the original volume.
> > >>> >> We then have XenServer attach to this new SolidFire volume, create a *new* SR on it, and then copy the VDI from the source SR to the destination SR (the new SR).
> > >>> >>
> > >>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but it requires CPU cycles on the compute cluster as well as network bandwidth to write to the SAN (thus it is slower and more resource intensive than a SolidFire snapshot).
> > >>> >>
> > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this issue before and during the CloudStack Collaboration Conference in Budapest in November. He agreed that this is a legitimate issue with the way XenServer is designed and could not think of a way (other than what I was doing) to get around it in current versions of XenServer.
> > >>> >>
> > >>> >> One thought is to have a feature added to XenServer that enables you to change the UUID of an SR and of a VDI.
> > >>> >>
> > >>> >> If I could do that, then I could take a SolidFire snapshot of the SolidFire volume and issue commands to XenServer to have it change the UUIDs of the original SR and the original VDI. I could then record the necessary UUID info in the CS DB.
> > >>> >>
> > >>> >> *** End of e-mail ***
> > >>> >>
> > >>> >> I have since investigated this on ESXi.
> > >>> >>
> > >>> >> ESXi does have a way for us to "re-signature" a datastore, so backend snapshots can be taken and effectively used on this hypervisor.
> > >>> >>
> > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lbarfield@tqhosting.com> wrote:
> > >>> >>
> > >>> >>> I'm just going to stick with the qemu-img option change for RBD for now (which should cut snapshot time down drastically), and look forward to this in the future. I'd be happy to help get this moving, but I'm not enough of a developer to lead the charge.
> > >>> >>>
> > >>> >>> As far as renaming goes, I agree that maybe "backups" isn't the right word. That being said, calling a full-sized copy of a volume a "snapshot" isn't the right word either. Maybe "image" would be better?
> > >>> >>>
> > >>> >>> I've also got my reservations about "accounts" vs "users" (I think "departments" and "accounts or users", respectively, would be less confusing), but that's a different thread.
> > >>> >>>
> > >>> >>> Thank You,
> > >>> >>>
> > >>> >>> Logan Barfield
> > >>> >>> Tranquil Hosting
> > >>> >>>
> > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wido@widodh.nl> wrote:
> > >>> >>> >
> > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there should still be support for copying snapshots to secondary storage as needed (for transfers between zones, etc.). I really think that this could be part of a larger move to clarify the naming conventions used for disk operations. Currently "Volume Snapshots" should probably really be called "Backups".
> > >>> >>> >> So having "snapshot" functionality, and a "convert snapshot to backup/template" operation, would be a good move.
> > >>> >>> >
> > >>> >>> > I fully agree that this would be a very great addition.
> > >>> >>> >
> > >>> >>> > I won't be able to work on this any time soon though.
> > >>> >>> >
> > >>> >>> > Wido
> > >>> >>> >
> > >>> >>> >> Thank You,
> > >>> >>> >>
> > >>> >>> >> Logan Barfield
> > >>> >>> >> Tranquil Hosting
> > >>> >>> >>
> > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <andrija.panic@gmail.com> wrote:
> > >>> >>> >>> BIG +1
> > >>> >>> >>>
> > >>> >>> >>> My team should submit some patches to ACS for better KVM snapshots, including whole-VM snapshots, etc., but it's too early to give details...
> > >>> >>> >>> best
> > >>> >>> >>>
> > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <andrei@arhont.com> wrote:
> > >>> >>> >>>
> > >>> >>> >>>> Hello guys,
> > >>> >>> >>>>
> > >>> >>> >>>> I was hoping to get some feedback from the community on the subject of having the ability to keep snapshots on the primary storage where it is supported by the storage backend.
> > >>> >>> >>>>
> > >>> >>> >>>> The idea behind this functionality is to improve how snapshots are currently handled on KVM hypervisors with Ceph primary storage. At the moment, the snapshots are taken on the primary storage and then copied to the secondary storage. This method is very slow and inefficient even on a small infrastructure, and even on medium deployments using snapshots in KVM becomes nearly impossible. If you have tens or hundreds of concurrent snapshots taking place, you will get a bunch of timeouts and errors, your network becomes clogged, etc. In addition, using these snapshots for creating new volumes or reverting VMs is also slow and inefficient. As above, when you have tens or hundreds of concurrent operations they will not succeed, and the majority of tasks will end in errors or timeouts.
> > >>> >>> >>>>
> > >>> >>> >>>> At the moment, taking a single snapshot of a relatively small volume (200GB or 500GB, for instance) takes tens if not hundreds of minutes. Taking a snapshot of the same volume on Ceph primary storage takes a few seconds at most! Similarly, converting a snapshot to a volume takes tens if not hundreds of minutes when secondary storage is involved, compared with seconds if done directly on the primary storage.
> > >>> >>> >>>>
> > >>> >>> >>>> I suggest that CloudStack should have the ability to keep volume snapshots on the primary storage where this is supported by the storage. Perhaps have a per-primary-storage setting that enables this functionality.
> > >>> >>> >>>> This will be beneficial for Ceph primary storage on KVM hypervisors, and perhaps on XenServer when Ceph is supported there in the near future.
> > >>> >>> >>>>
> > >>> >>> >>>> This will greatly speed up the process of using snapshots on KVM, and users will actually start using snapshotting rather than giving up in frustration.
> > >>> >>> >>>>
> > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you are in agreement.
> > >>> >>> >>>>
> > >>> >>> >>>> Thanks for your input
> > >>> >>> >>>>
> > >>> >>> >>>> Andrei
> > >>> >>> >>>
> > >>> >>> >>> --
> > >>> >>> >>> Andrija Panić
>
> --
> *Ian Rae*
> PDG *|* CEO
> t *514.944.4008*
>
> *CloudOps* Votre partenaire infonuagique *|* Cloud Solutions Experts
> w cloudops.com *|* 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> <http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/>

--
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
*™*
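
For anyone who wants to see what the primary-storage snapshot path discussed in this thread looks like against Ceph, below is a minimal sketch using the python-rados/python-rbd bindings. It is not CloudStack code: it assumes the bindings are installed, a Ceph cluster is reachable via /etc/ceph/ceph.conf and a valid keyring, and the pool, image, and snapshot names are purely illustrative.

# Minimal sketch (not CloudStack code): take and protect an RBD snapshot
# directly on primary storage, the operation described above as taking
# "a few seconds at most". Assumes python-rados/python-rbd and a reachable
# Ceph cluster; pool, image, and snapshot names are hypothetical.
import rados
import rbd

POOL = 'cloudstack'     # hypothetical RBD pool used as primary storage
IMAGE = 'vm-volume-1'   # hypothetical RBD image backing a CloudStack volume
SNAP = 'cs-snap-001'    # hypothetical snapshot name

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx(POOL)
    try:
        image = rbd.Image(ioctx, IMAGE)
        try:
            # The snapshot is created inside the Ceph cluster: no data is
            # copied to secondary storage, and no hypervisor CPU or network
            # bandwidth is consumed.
            image.create_snap(SNAP)
            # Protecting the snapshot allows it to be cloned later, e.g. to
            # build a new volume or template from it without a full copy.
            image.protect_snap(SNAP)
            print([snap['name'] for snap in image.list_snaps()])
        finally:
            image.close()
    finally:
        ioctx.close()
finally:
    cluster.shutdown()

Reverting to or cloning from such a snapshot also stays on the Ceph side, which is where the speed-up over the copy-to-secondary-storage path comes from.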