Date: Mon, 2 Aug 2010 22:27:36 +0200
From: Stefan Sperling
To: "Vallon, Justin"
Cc: "users@subversion.apache.org"
Subject: Re: Support for filesystem snapshots (?)

On Mon, Aug 02, 2010 at 03:25:48PM -0400, Vallon, Justin wrote:
> > E.g. Subversion's FSFS needs to create a revision file from the commit's
> > transaction, and move the finalized revision file into place.
> > After the revision file has been moved into place successfully, FSFS also
> > updates the svn:date revision property and moves the revision properties file
> > into place (or copies revprop data into an sqlite database if you use
> > revprop packing). Then, it updates the 'current' file which contains the
> > number of the current HEAD revision. If you use representation sharing to
> > save disk space, the commit may involve further updates to yet another
> > sqlite database.
> >
> > All these actions need to complete in order to have a consistent state.
> >
> > If you're interested in seeing the code that does this, look at the
> > svn_fs_fs__commit() and commit_body() functions in
> > http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
>
> I see this is executed with a FS write lock.
> My concern would be focused on the interaction between the commit code
> and any rollback code. For example, if the commit dies (any any point
> during the commit), what will be required to insure that the repository
> behaves as if the commit never started? Will a repo cleanup be required;
> will the next committer cleanup the partial rev automatically (ie:
> overwrite stale files); will the repo be hopelessly inconsistent?

I honestly didn't know so I went and asked. And learned something!

users asking interesting questions: http://mail-archives.apache.org/mod_mbox/subversion-users/201008.mbox/%3C6EC02A00CC9F684DAF4AF4084CA84D5F01C40CD7@DRMBX3.winmail.deshaw.com%3E
i dunno how fsfs behaves in face of an interrupted commit; whether or not it needs to be rolled back
if you haven't touched current than the rev file will never be read and will be overwritten
stsp: does that answer your question?
i think so
because the rev file of the following commit will have the same name to move things into place onto
write lock only for revprop change and commit
:-)
so, using rsync for backup is fine?
if you copy current first, yes
what's hotcopy for then? just bdb?
stsp: copying 'current' first ... :-)
ok, so what happens if I don't copy current first?
you can copy revs/
then a commit happens
then you copy current
so you don't have all of revs/ that current claims exist
then I need to unwedge it by decrementing current
right
and hopefully you haven't just crossed a packing boundary
eg if you want to decrement from 1002 to 999
and someone packed it already
a bit more work

So in the event that 'current' says you are at rN but the rev data in
the repository is still at r'N-1', the repository will complain (I've
tried that, "No such revision rN"), and you'll need to decrement the
counter in 'current'. But otherwise, the repository will continue to work.

Now, how does rsync, or a file-system snapshot, know to make sure
that 'current' is always copied first?
Even if you copy 'current' first manually, rsync might later overwrite it.
But unless you use packing it's trivial to fix the backup if it breaks,
and all you risk is losing the most recent HEAD revision, which you
may not have gotten with a hotcopy anyway.

Still, I think I'll keep advising people to use hotcopy.
It avoids the problem with a too recent 'current' file, i.e. the backup
is always usable out of the box. And who knows how Subversion's on-disk
formats will change in the future. The hotcopy approach will always be
supported, and works fine if, as you pointed out, you can make sure that
a hotcopy is being backed up while not being written to.

Stefan
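
For illustration only, here is a minimal Python sketch of the "copy
'current' first" ordering discussed above, plus a check for the wedged
state where 'current' names a revision the backup does not contain.
Nothing in it is part of Subversion: the function names are made up, and
it assumes an unpacked FSFS db/ directory whose revision files live at
revs/<N> or revs/<shard>/<N>. It ignores revprops, packing and the sqlite
databases, so treat it as a sketch, not a replacement for hotcopy.

```python
import shutil
from pathlib import Path


def backup_current_first(src_db: Path, dst_db: Path) -> None:
    """Copy 'current' before anything else, then the rest of the tree."""
    dst_db.mkdir(parents=True, exist_ok=True)
    # Step 1: save the HEAD pointer before any revision data is copied.
    shutil.copy2(src_db / "current", dst_db / "current")
    # Step 2: copy everything else, never overwriting the saved 'current'.
    for entry in src_db.iterdir():
        if entry.name == "current":
            continue
        target = dst_db / entry.name
        if entry.is_dir():
            shutil.copytree(entry, target, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, target)


def head_is_backed_up(db: Path) -> bool:
    """Return True if the revision named in 'current' has a file in revs/."""
    # The first token of 'current' is taken to be the HEAD revision number.
    head = (db / "current").read_text().split()[0]
    revs = db / "revs"
    # Assumed layouts: revs/<N> (unsharded) or revs/<shard>/<N> (sharded).
    return (revs / head).is_file() or any(revs.glob(f"*/{head}"))
```

If head_is_backed_up() returns False for a backup, that is the case
described above where the counter in 'current' has to be decremented
by hand.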