From: Damien Katz <damienkatz@gmail.com>
To: couchdb-user@incubator.apache.org
Subject: Re: Relying on revisions for rollbacks
Date: Mon, 17 Mar 2008 15:52:46 -0400
In-Reply-To: <47DEBD13.30303@theopenlearningcentre.com>

On Mar 17, 2008, at 2:48 PM, Alan Bell wrote:

> Jan Lehnardt wrote:
>> You can do that, too. With attachments, you'd have it all in one
>> place and would not need to write your views in a way that they
>> don't pick up old revisions. That said, it is certainly possible to
>> store older revisions in other documents, if that solves your
>> problems.
>>
>> Cheers
>> Jan
>
> Well, I might be missing something about the way CouchDB handles
> attachments, but this doesn't sound good to me. Adding attachments to
> hold the revision history means that the attachments have to be
> replicated each time a revision happens.

Right now, this is true. But with attachment-level incremental
replication, only attachments that have changed will be replicated.
> Also, a replication conflict is pretty much the same thing as a
> revision. A client application would have no knowledge of a
> replication conflict happening, but this would be good to see in a
> wiki-like page history. I can imagine that in a distributed system it
> would be very hard for the clients to maintain a revision history as
> attachments.

I disagree about the difficulty. It's surprisingly simple conceptually.

The first thing is: every time you update the document, simply attach
the previous revision when you save. Eventually there will be a flag
you can pass in to do this automatically.

Then, if there is a replication conflict to resolve, simply open the
two conflicting documents (manually if necessary), update your chosen
winner with any info you want to preserve from the loser (data,
revision histories, etc.), then delete the loser revision. And that's
it.

The thing about this system is that you can get very simple or very
complicated with the revision history aspects; it's up to the
application developer. The nice thing is you generally don't need to
worry about concurrent or distributed updates, with other nodes
attempting the same thing. The same rules still apply, and eventually
the conflicts will be resolved.

> As for writing views to not pick up old revisions, I think all
> applications should assume that all documents are at all times
> carrying a bundle of prior versions and replication/save conflicts.
> One of the nasty things in Notes is that most applications assume
> that replication conflicts don't happen, and can break when they do.
> I think a major feature of CouchDB is sensible handling of revisions
> and conflicts. Purging revisions and conflicts is going to be
> necessary for some applications, but in others it is desirable to
> retain all versions. It would be good at least to be able to specify
> which databases to run compaction on and which to exclude.
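As an aside, the attach-on-save and conflict-resolution steps above can be
sketched in a few lines. This is only an illustration: it models documents as
plain Python dicts instead of talking to a real CouchDB server, and the
function names and the "rev-<rev>" attachment naming scheme are made up for
the example.

```python
import json

def save_with_history(doc, prev_doc):
    """Simulate 'attach the previous revision when you save': the prior
    body is stored as an attachment on the new revision. Documents are
    plain dicts here; names are illustrative, not a real CouchDB API."""
    new_doc = dict(doc)
    attachments = dict(prev_doc.get("_attachments", {})) if prev_doc else {}
    if prev_doc is not None:
        # Store the previous body (user fields only) keyed by its revision.
        attachments["rev-%s" % prev_doc["_rev"]] = json.dumps(
            {k: v for k, v in prev_doc.items() if not k.startswith("_")}
        )
    new_doc["_attachments"] = attachments
    return new_doc

def resolve_conflict(winner, loser):
    """Simulate resolving a replication conflict: fold the loser's
    revision-history attachments (and its final body) into the chosen
    winner; the loser revision would then be deleted."""
    merged = dict(winner)
    atts = dict(merged.get("_attachments", {}))
    atts.update(loser.get("_attachments", {}))
    # Preserve the loser's last body as one more history entry.
    atts["rev-%s" % loser["_rev"]] = json.dumps(
        {k: v for k, v in loser.items() if not k.startswith("_")}
    )
    merged["_attachments"] = atts
    return merged
```

With that, a normal save carries its predecessor along, and resolving a
conflict is just a merge of the two documents' history attachments followed by
deleting the loser, exactly as described.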
The scheduling of compaction is something that will be external to the
core database code. Much of the work here isn't in the actual
file-level compaction code, but in creating tools to monitor things
and initiate compaction with the desired options.

> What is the proposed rule for compaction? Just deleting all
> revisions it finds? Deleting old revisions over a certain age?

For the first cut of compaction, it will unconditionally purge all
previous revisions of a document from a database, leaving only the
most recent revision of the winner and its conflicts. Then we will
provide a way to perform selective purging during compaction, probably
via a user-provided function that is fed each document at compaction
time and returns true or false to indicate whether the document should
be kept or discarded. This is also how deletion "stubs" will be purged
(keeping some meta info about deleted documents is necessary for
replication).

> Another thought: it would be nice perhaps to run compaction on some
> servers but not on others, for replicas of the same database. Thus a
> bunch of offline clients could compact fairly frequently and
> aggressively, while a central server they all replicate with, which
> has lots of disk space, could retain all versions.

OK, that's a neat use case, but I'm not sure how you would handle the
intermediate edits replicating back to the server. Maybe they just get
lost. It seems possible to support such a thing without a lot of work.
We'll see what is possible.

> I am thinking in particular of the scenario of OLPC XO laptops
> replicating with a school server.
>
> Alan.
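P.S. The selective-purging idea above (a user-provided function fed each
document, returning true to keep it) can be sketched like so. This is a toy
model, not the planned implementation: revision lists are plain Python lists,
newest last, and `compact`/`keep_fn` are hypothetical names.

```python
def compact(docs, keep_fn):
    """Toy model of selective purging at compaction time. docs maps a
    document id to its list of revisions (newest last); keep_fn is fed
    each previous revision and returns True to keep it, False to
    discard it. The newest revision always survives."""
    compacted = {}
    for doc_id, revs in docs.items():
        old, newest = revs[:-1], revs[-1]
        kept = [r for r in old if keep_fn(r)]
        compacted[doc_id] = kept + [newest]
    return compacted
```

Passing `lambda r: False` gives the first-cut behavior (purge everything but
the most recent revision); a real filter might keep revisions newer than some
cutoff date.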