couchdb-user mailing list archives

From Jan Lehnardt <>
Subject Re: Newbie questions
Date Wed, 24 Sep 2008 08:04:46 GMT

On Sep 24, 2008, at 9:53 , Ayende Rahien wrote:

> On Wed, Sep 24, 2008 at 10:46 AM, Jan Lehnardt <> wrote:
>>> Anyway, I had a few questions that I hope I'll be able to get some
>>> answers for.
>>> merge conflicts - how does couch db decide on "best" revision?
>> It arbitrarily chooses one revision. The only guarantee that is made is
>> that for the same conflict all nodes in a CouchDB cluster choose the
>> same latest revision to ensure data consistency.
> How do you ensure that across a cluster, all nodes will select the same
> version?
> Assume that I have the following sequence of events:
> - create doc A (v1)
> - update doc A from v1 (v2)
> - update doc A from v1 (v3) - conflict
> - update doc A from v1 on separate machine (v?) - conflict
> How does it get resolved?

There are two types of conflicts here: update conflicts and
replication conflicts.

You cannot update doc A from V1 to V3: once V2 exists, an update based on
V1 is rejected as an update conflict.
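
To make the update-conflict case concrete, here is a minimal sketch of the
check in Python. ToyStore, put() and UpdateConflict are made-up names for
illustration, not CouchDB's API, and the integer revisions are an
assumption. Over HTTP, CouchDB answers a stale write like this with
409 Conflict.

```python
class UpdateConflict(Exception):
    """Raised when a write is based on a revision that is no longer current."""


class ToyStore:
    """Hypothetical in-memory store; illustrates the check, not CouchDB itself."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision number, body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The caller must name the revision it is updating from.
        if rev != current_rev:
            raise UpdateConflict(
                f"{doc_id}: based on {rev}, current is {current_rev}"
            )
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, body)
        return new_rev


store = ToyStore()
v1 = store.put("A", {"n": 1})          # create doc A (v1)
v2 = store.put("A", {"n": 2}, rev=v1)  # update doc A from v1 (v2)
try:
    store.put("A", {"n": 3}, rev=v1)   # second update from v1 is stale
    rejected = False
except UpdateConflict:
    rejected = True
```

The writer who gets rejected has to fetch the current revision and retry,
so on a single server the "v3 from v1" step never happens.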

- server 1: create doc A(V1)
- replicate server 1 and server 2
(doc A now lives on server 1 and server 2 with V1)
- server 1: update doc A(V1) to doc A(V2a)
- server 2: update doc A(V1) to doc A(V2b)
(now there are two V2s for doc A. No problem so far.)
- replicate server 1 and server 2:
  - CouchDB sees that V2a and V2b are different and picks
    one of them to be the latest revision. Say V2a gets chosen.
  - Server 1 and server 2 now both have doc A (V2a) as the
    latest revision, but doc A is flagged with a _conflict attribute.
  - You need to go in and resolve that by either approving CouchDB's
    automatic choice or by using a previous revision. There is no  
    and there is no auto-conflict-resolution. Only auto-conflict- 
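
The replication steps above can be sketched in a few lines of Python. This
is only an illustration of "arbitrary but the same on every node": the
"N-suffix" revision strings and the "highest key wins" rule are assumptions
made for the example, and rev_key(), pick_winner() and replicate() are
hypothetical helpers, not CouchDB functions.

```python
def rev_key(rev):
    """Split an assumed 'N-suffix' revision string into a sortable key."""
    pos, _, suffix = rev.partition("-")
    return int(pos), suffix


def pick_winner(leaf_revs):
    """Pick one revision using only the revisions themselves.

    Because the rule ignores arrival order and which server is asking,
    every node holding the same set of revisions picks the same winner.
    """
    return max(leaf_revs, key=rev_key)


def replicate(a, b):
    """Toy two-way replication: both servers end up with all revisions."""
    a |= b
    b |= a


server1 = {"2-a"}  # server 1 updated V1 to V2a
server2 = {"2-b"}  # server 2 updated V1 to V2b

replicate(server1, server2)

winner1 = pick_winner(server1)
winner2 = pick_winner(server2)
conflicts = sorted(server1 - {winner1})  # losing revisions are kept, not dropped
```

With this particular rule "2-b" happens to win. Which one wins does not
matter, as long as both servers agree and the loser stays around for you to
resolve.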

>>> Two questions that are of particular interest to me, and I haven't been
>>> able to get from the code so far are:
>>> - How is the data stored? I think that it is a binary tree on disk, but I
>>> am not following how updates to that can be safe to do so with ACID
>>> guarantees.
>> Writes are serialized. Only one write can happen at a time and it is
>> completely flushed and committed to disk (2 x fsync()) before another
>> write comes in. Writes are append-only. No data is ever overwritten.
>> This gives us the buzzcronyms :-)
> Can you speak more on the actual file format? I don't think that I
> understand how you can have append only with binary trees.

I have to refer you to Damien or the source for that one. :-)
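
This is not the actual file format, but a toy sketch in Python may help show
why appending plus two fsync() calls is enough for a reader to ignore a
half-finished write. The framing (4-byte length, JSON payload, one-byte
commit marker) and the names append_commit() and read_committed() are all
assumptions made up for this example.

```python
import json
import os
import struct
import tempfile

COMMIT = b"\x01"  # assumed one-byte marker: "the record before me is complete"


def append_commit(path, record):
    """Append one record, then its commit marker, fsyncing after each step."""
    payload = json.dumps(record).encode("utf-8")
    with open(path, "ab") as f:
        f.write(struct.pack(">I", len(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())  # first fsync: the data is on disk
        f.write(COMMIT)
        f.flush()
        os.fsync(f.fileno())  # second fsync: the commit marker is on disk


def read_committed(path):
    """Return records up to the first one without a commit marker."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            (length,) = struct.unpack(">I", header)
            payload = f.read(length)
            marker = f.read(1)
            if len(payload) < length or marker != COMMIT:
                break  # torn write: ignore it, earlier data is untouched
            records.append(json.loads(payload))
    return records


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.log")
    append_commit(path, {"doc": "A", "rev": 1})
    append_commit(path, {"doc": "A", "rev": 2})
    with open(path, "ab") as f:  # simulate a crash in the middle of a write
        f.write(struct.pack(">I", 100) + b"partial")
    recovered = read_committed(path)
```

Since nothing already committed is ever overwritten, a crash can only
damage the tail of the file. A real system would also have to clean up that
tail before appending again, which this toy skips.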

