couchdb-dev mailing list archives

From Adam Kocoloski <adam.kocolo...@gmail.com>
Subject Re: couchdb transactions changes
Date Mon, 09 Feb 2009 05:44:20 GMT
On Feb 8, 2009, at 10:29 PM, Antony Blakey wrote:

>
> On 09/02/2009, at 1:07 PM, Damien Katz wrote:
>
>>
>> On Feb 8, 2009, at 9:24 PM, Chris Anderson wrote:
>>
>>> On Sun, Feb 8, 2009 at 5:54 PM, Damien Katz <damien@apache.org>  
>>> wrote:
>>>>
>>>> It's possible to use MVCC for replication. You'll need to create  
>>>> special
>>>> HTTP command to return you all the documents you are interested  
>>>> in a single
>>>> request, and a special replicator that uses that command and  
>>>> loads those
>>>> documents and writes them to the destination.
>
>>> This sounds a lot like the notification view Damien's been talking
>>> about, where clients can register to be told about database updates
>>> that match particular functions.
>>>
>>> The main problem I see with MVCC replication is that if it dies in  
>>> the
>>> middle, you might not be able to restart it right where you left  
>>> off.
>>
>> That would be a big problem for replicating huge databases.  
>> Everything must come over in one transaction.
>
> You could still do that incrementally e.g. it wouldn't have to load  
> in a single request. The key is that the replication shows MVCC  
> boundaries i.e. add a marker in the replication stream to indicate  
> when you passed an MVCC commit point. The current model would ignore  
> such markers; nothing else is required, I think. You could even  
> cycle as long as there were new MVCC states, which would give the  
> same 'includes-updates-as-they-come-in' form of replication, but  
> with somewhat more consistency. If these restart points were  
> included in the replication stream, then systems that wanted to  
> allow replication rollback (see below) could reset the rollback MVCC  
> state when they get an end-of-MVCC state marker.

A bit more work is required, I think.  In addition to inserting MVCC  
commit point markers in the replication stream, we'd also have to  
include all the document/rev pairs that were part of the _bulk_docs  
update.  As it stands today, if one of those documents is updated  
again, it will only show up at the later update_seq.
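
To make the concern concrete, here is a minimal sketch in Python. It is a toy model, not CouchDB code: the BySeqIndex class, its methods, and the event tuples are all hypothetical names made up for illustration. It assumes the by-sequence index keeps only the latest update_seq per document, and shows a commit marker that carries the document/rev pairs of its bulk update.

```python
# Hypothetical toy model, not CouchDB internals.
class BySeqIndex:
    def __init__(self):
        self.update_seq = 0
        self.latest = {}    # doc_id -> (seq, rev); one entry per document
        self.commits = []   # (last_seq, [(doc_id, rev), ...]) per bulk update

    def bulk_docs(self, docs):
        """Apply a list of (doc_id, rev) pairs as a single commit."""
        pairs = []
        for doc_id, rev in docs:
            self.update_seq += 1
            # A later update overwrites the earlier seq entry for this doc.
            self.latest[doc_id] = (self.update_seq, rev)
            pairs.append((doc_id, rev))
        self.commits.append((self.update_seq, pairs))

    def stream(self, since=0):
        """Yield doc entries in seq order, with a marker at each commit point."""
        entries = sorted(
            (seq, doc_id, rev)
            for doc_id, (seq, rev) in self.latest.items()
            if seq > since
        )
        i = 0
        for last_seq, pairs in self.commits:
            if last_seq <= since:
                continue
            while i < len(entries) and entries[i][0] <= last_seq:
                yield ("doc",) + entries[i]
                i += 1
            # The marker lists every doc/rev pair that belonged to the commit.
            yield ("commit", last_seq, pairs)


idx = BySeqIndex()
idx.bulk_docs([("a", "1-x"), ("b", "1-y"), ("c", "1-z")])  # seqs 1..3
idx.bulk_docs([("a", "2-x")])                              # seq 4
events = list(idx.stream())
```

In this model, document "a" no longer appears before the first commit marker, because its entry moved to seq 4. Only the pair list on the marker records that revision "1-x" of "a" was part of the first commit, and the target would still need some way to fetch that revision.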

This could actually get pretty hairy, now that I think of it.  What  
happens during compaction?  Do we save old revisions of a document if  
the revision was part of a _bulk_docs update?

Adam
