couchdb-user mailing list archives

From Damien Katz <>
Subject Re: Relying on revisions for rollbacks
Date Mon, 17 Mar 2008 19:52:46 GMT
On Mar 17, 2008, at 2:48 PM, Alan Bell wrote:

> Jan Lehnardt wrote:
>> You can do that, too. With attachments, you'd have it all in one
>> place and would not need to write your views in a way that they
>> don't pick up old revisions. That said, it is certainly possible to
>> store older revisions in other documents, if that solves your
>> problems.
>> Cheers
>> Jan
>> --  
> well I might be missing something about the way couchdb handles  
> attachments but this doesn't sound good to me. Adding attachments to  
> hold the revision history means that the attachments have to be  
> replicated each time a revision happens.

Right now, this is true. But once we have attachment-level incremental
replication, only the attachments that have changed will replicate.

> Also a replication conflict is pretty much the same thing as a  
> revision, a client application would have no knowledge of a  
> replication conflict happening but this would be good to see in a  
> wiki-like page history. I can imagine in a distributed system it  
> would be very hard for the clients to maintain a revision history as  
> attachments.

I disagree about the difficulty. It's surprisingly simple conceptually.

The first thing is, every time you update the document, simply attach  
the previous revision when you save. Eventually there will be a flag  
you can pass in to do this automatically.
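
A minimal sketch of the "attach the previous revision when you save" step, using plain dicts in place of stored documents. The function name `save_with_history`, the `rev-<n>.json` attachment naming, and the simplified `_attachments` layout are all assumptions made for illustration, not CouchDB's actual API or wire format:

```python
import copy
import json


def save_with_history(current_doc, new_fields):
    """Return an updated copy of current_doc that carries the previous
    revision as an attachment (hypothetical helper, in-memory only)."""
    previous = copy.deepcopy(current_doc)
    # Pull the existing history out so snapshots do not nest inside each other.
    history = previous.pop("_attachments", {})
    rev = previous.get("_rev", "0")

    updated = dict(previous)
    updated.update(new_fields)
    updated["_attachments"] = dict(history)
    # Assumed naming scheme: one JSON snapshot per superseded revision.
    updated["_attachments"]["rev-%s.json" % rev] = {
        "content_type": "application/json",
        "data": json.dumps(previous, sort_keys=True),
    }
    return updated


doc = {"_id": "page", "_rev": "1", "body": "hello"}
doc_v2 = save_with_history(doc, {"body": "hello world"})
```

The caller would then save `doc_v2` as usual; the old body stays readable from the `rev-1.json` attachment.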

Then, if there is a replication conflict to resolve, simply open the  
two conflicting documents (manually if necessary), update your chosen  
winner with any info you want to preserve from the loser (data,
revision histories, etc.), then delete the loser revision.
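
The resolution step can be sketched the same way, again on plain dicts. `resolve_conflict` and its `keep_fields` parameter are hypothetical names, and modelling the delete as a `_deleted` marker on the loser's revision is an assumption of this sketch:

```python
def resolve_conflict(winner, loser, keep_fields=()):
    """Fold selected data from the loser into the winner and return
    (merged_winner, deletion_marker_for_loser). Illustrative only."""
    merged = dict(winner)

    # Preserve chosen fields from the loser without overwriting the winner.
    for field in keep_fields:
        if field in loser and field not in merged:
            merged[field] = loser[field]

    # Union the two revision histories; the winner's entries take priority.
    attachments = dict(loser.get("_attachments", {}))
    attachments.update(winner.get("_attachments", {}))
    if attachments:
        merged["_attachments"] = attachments

    # Assumed representation of "delete the loser revision".
    deletion = {"_id": loser["_id"], "_rev": loser["_rev"], "_deleted": True}
    return merged, deletion
```

Saving `merged_winner` and the deletion marker leaves a single surviving document that still carries both histories.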

And that's it. The thing about this system is you can get very simple  
or very complicated with the revision history aspects; it's up to the
application developer. The nice thing is you generally don't need to  
worry about concurrent or distributed updates with other nodes  
attempting the same thing. The same rules still apply and eventually  
the conflicts will be resolved.

> As for writing views to not pick up old revisions, I think all  
> applications should assume that all documents are at all times  
> carrying a bundle of prior versions and replication/save conflicts.  
> One of the nasty things in Notes is that most applications assume  
> that replication conflicts don't happen and can break when they do  
> happen. I think a major feature of Couchdb is sensible handling of  
> revisions and conflicts. Purging revisions and conflicts is going to  
> be necessary for some applications, but in others it is desirable to  
> retain all versions. It would be good at least to be able to specify  
> which databases to run compaction on and which to exclude.

The scheduling of compaction is something that will be external to the  
core database code. Much of the work here isn't in the actual
file-level compaction code, but in creating tools to monitor things and
initiate it with desired options.

> What is the proposed rule for compaction? Just deleting all  
> revisions it finds? Deleting old revisions over a certain age?

For the first cut of compaction, it will unconditionally purge all  
previous revisions of a document from a database, leaving only the  
most recent revisions of the winner and its conflicts.
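
As a toy model of that first cut (not the real storage format): treat a document as a set of named branches, each a list of revisions from oldest to newest, and keep only the newest revision on each. The branch names and the function `compact_first_cut` are assumptions for illustration:

```python
def compact_first_cut(branches):
    """Keep only the newest revision of the winner and of each conflict."""
    return {name: revs[-1] for name, revs in branches.items() if revs}


before = {
    "winner": ["1-a", "2-b", "3-c"],
    "conflict": ["1-a", "2-x"],
}
after = compact_first_cut(before)
```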

Then we will provide a way to perform selective purging during
compaction, probably with a user-provided function that will be fed each
document at compaction time and will return true or false depending on
whether the document should be kept or discarded. This is also how
deletion "stubs" will be purged (keeping some meta info about deleted
documents is necessary for replication).
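
A rough sketch of what such a user-provided function might look like. Nothing here is a committed interface: `compact_selectively`, the policy `keep_unless_stale_stub`, and the `deleted_at` timestamp field are all hypothetical:

```python
from functools import partial


def compact_selectively(documents, keep):
    """Return only the documents for which the user-provided keep() is true."""
    return [doc for doc in documents if keep(doc)]


def keep_unless_stale_stub(doc, now, max_stub_age):
    """Hypothetical policy: keep live documents, and keep deletion stubs
    only while they are no older than max_stub_age."""
    if doc.get("_deleted"):
        return now - doc["deleted_at"] <= max_stub_age
    return True


docs = [
    {"_id": "a"},
    {"_id": "b", "_deleted": True, "deleted_at": 10},
    {"_id": "c", "_deleted": True, "deleted_at": 95},
]
policy = partial(keep_unless_stale_stub, now=100, max_stub_age=30)
kept = compact_selectively(docs, policy)
```

Here the old stub `b` is discarded while the recent stub `c` survives, so nodes that have not yet replicated the deletion can still learn about it.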

> Another thought, it would be nice perhaps to run compaction on some  
> servers but not on others for replicas of the same database. Thus a  
> bunch of offline clients could compact fairly frequently and  
> aggressively, however a central server they all replicate with that  
> has lots of disk space could retain all versions.

Ok, that's a neat use case but I'm not sure how you would handle the  
intermediate edits replicating back to the server. Maybe they just get  
lost. It seems possible to support such a thing without a lot of work.  
We'll see what is possible.

> I am thinking in particular of the scenario of OLPC XO laptops  
> replicating with a school server.

> Alan.
