couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "Replication_and_conflicts" by BrianCandler
Date Sun, 01 Nov 2009 10:50:19 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "Replication_and_conflicts" page has been changed by BrianCandler.
The comment on this change is: Describe the revision tree and open_revs=all.
http://wiki.apache.org/couchdb/Replication_and_conflicts?action=diff&rev1=3&rev2=4

--------------------------------------------------

  }
  }}}
  
+ == Revision tree ==
+ 
+ When you update a document in couchdb, it keeps a list of the previous revisions.
+ In the case where conflicting updates are introduced, this history branches into a
+ tree, where the current conflicting revisions for this document form the tips
+ (leaf nodes) of this tree.
+ 
+ {{{
+       ,--> r2a
+     r1 --> r2b
+       `--> r2c
+ }}}
+ 
+ Each branch can then extend its history - for example if you read
+ revision r2b and then PUT with `?rev=r2b` then you will make a new revision
+ along that particular branch.
+ 
+ {{{
+       ,--> r2a -> r3a -> r4a
+     r1 --> r2b -> r3b
+       `--> r2c -> r3c
+ }}}
+ 
+ Here, (r4a, r3b, r3c) are the set of conflicting revisions. The way you
+ resolve a conflict is to delete the leaf nodes along the other branches.
+ So when you combine (r4a+r3b+r3c) into a single merged document, you
+ would write the merged document as a new revision on top of r4a (creating
+ r5a) and delete r3b and r3c.
+ 
+ {{{
+       ,--> r2a -> r3a -> r4a -> r5a
+     r1 --> r2b -> r3b -> (r4b deleted)
+       `--> r2c -> r3c -> (r4c deleted)
+ }}}
+ 
+ Note that r4b and r4c still exist as leaf nodes in the history tree, but as
+ deleted docs. You can retrieve them but they will be marked `"_deleted":true`.
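
As a rough illustration of the resolved tree above, the sketch below models the history as a child-to-parent map and picks out the leaf nodes. The names `parents`, `deleted`, `leaves`, `live_leaves` and `history` are hypothetical and chosen only for this example; this is not how couchdb stores revisions internally.

```python
# Hypothetical in-memory model of the resolved revision tree shown above.
# Each revision maps to its parent; None marks the root.
parents = {
    "r1": None,
    "r2a": "r1", "r3a": "r2a", "r4a": "r3a", "r5a": "r4a",
    "r2b": "r1", "r3b": "r2b", "r4b": "r3b",
    "r2c": "r1", "r3c": "r2c", "r4c": "r3c",
}
# Leaf revisions that were written with "_deleted": true.
deleted = {"r4b", "r4c"}


def leaves(parents):
    """Return the revisions that are not the parent of any other revision."""
    has_child = {p for p in parents.values() if p is not None}
    return sorted(rev for rev in parents if rev not in has_child)


def live_leaves(parents, deleted):
    """Return the leaves that are not deleted; more than one means a conflict."""
    return [rev for rev in leaves(parents) if rev not in deleted]


def history(parents, rev):
    """Walk from a revision back to the root, newest first."""
    path = []
    while rev is not None:
        path.append(rev)
        rev = parents[rev]
    return path


print(leaves(parents))
print(live_leaves(parents, deleted))
print(history(parents, "r5a"))
```

With the tree above, only one non-deleted leaf remains, which is what it means for the conflict to be resolved.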
+ 
+ When you compact a database, the bodies of all the non-leaf revisions are
+ discarded. However, the list of historical _revs is retained, for the benefit of
+ later conflict resolution in case you meet any old replicas of the database at
+ some time in future. There is "revision pruning" to stop this list getting
+ arbitrarily large.
+ 
  = Working with conflicting documents =
  
  == HTTP API ==
@@ -175, +217 @@

  The basic `GET /db/bob` operation will not show you any information about
  conflicts. You see only the deterministically-chosen winner, and get no
  indication as to whether other conflicting revisions exist or not.
+ 
+ {{{
+ {"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"bar"}
+ }}}
  
  If you do `GET /db/bob?conflicts=true`, and the document is in a conflict
  state, then you will get the winner plus a _conflicts member containing an
  array of the revs of the other, conflicting revision(s). You can then fetch
  them individually using subsequent `GET /db/bob?rev=xxxx` operations.
  
- As far as I can tell, from the list of _conflicts you cannot fetch all those
- versions in one go with a multi-document fetch (_all_docs).  They have to be
- individual GETs.
+ {{{
+ {"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"bar",
+ "_conflicts":["2-65db2a11b5172bf928e3bcf59f728970","2-5bc3c6319edf62d4c624277fdd0ae191"]}
+ }}}
  
- Your application can then choose to display them all to the user. Or it
- could attempt to merge them, write back the merged version, and delete the
- conflicting versions - that is, "resolve" the conflict.
+ If you do `GET /db/bob?open_revs=all` then you will get all the leaf nodes
+ of the revision tree. This ''will'' give you all the current conflicts, but will
+ also give you leaf nodes which have been deleted (i.e. parts of the conflict
+ history which have since been resolved). You can remove these by filtering
+ out documents with `"_deleted":true`.
  
- If you are merging multiple conflicts into a single version, you need to
- delete all the conflicting revisions explicitly.  However a single
- _bulk_docs update can write the new version and simultaneously delete all
- the other ones.
+ {{{
+ [{"ok":{"_id":"test","_rev":"2-5bc3c6319edf62d4c624277fdd0ae191","hello":"foo"}},
+ {"ok":{"_id":"test","_rev":"2-65db2a11b5172bf928e3bcf59f728970","hello":"baz"}},
+ {"ok":{"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"bar"}}]
+ }}}
+ 
+ The "ok" tag is an artefact of open_revs, which also lets you list explicit
+ revisions as a JSON array, e.g. `open_revs=[rev1,rev2,rev3]`. In this form,
+ it would be possible to request a revision which is now missing, because the
+ database has been compacted.
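
A minimal sketch of the filtering step, assuming the response body has the shape shown above: a JSON array whose entries wrap each document in an "ok" member. The function name `live_revisions` and the deleted revision `3-0000` in the sample are hypothetical; entries without an "ok" member (for example a revision that is no longer available) are simply skipped.

```python
import json


def live_revisions(open_revs_body):
    """Parse an open_revs response and return the non-deleted leaf documents."""
    entries = json.loads(open_revs_body)
    # Unwrap the "ok" envelope; skip any entry that does not carry a document.
    docs = [entry["ok"] for entry in entries if "ok" in entry]
    # Drop leaves that only record a resolved (deleted) branch.
    return [doc for doc in docs if not doc.get("_deleted", False)]


# Sample body: the three revisions from the example above, plus one
# made-up deleted leaf ("3-0000") to show the filter at work.
sample_body = """
[{"ok":{"_id":"test","_rev":"2-5bc3c6319edf62d4c624277fdd0ae191","hello":"foo"}},
 {"ok":{"_id":"test","_rev":"2-65db2a11b5172bf928e3bcf59f728970","hello":"baz"}},
 {"ok":{"_id":"test","_rev":"2-b91bb807b4685080c6a651115ff558f5","hello":"bar"}},
 {"ok":{"_id":"test","_rev":"3-0000","_deleted":true}}]
"""

for doc in live_revisions(sample_body):
    print(doc["_rev"], doc["hello"])
```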
+ 
+ (NOTE: it's not clear whether the ordering is related to the deterministic choice
+ of the "winning" revision. In the above example, the "winning" revision is 2-b91b...,
+ which is returned last, but I don't know if this is guaranteed to always be true.)
+ 
+ Once you have retrieved all the conflicting revisions, your application can then
+ choose to display them all to the user. Or it could attempt to merge them, write
+ back the merged version, and delete the conflicting versions - that is, to resolve
+ the conflict permanently.
+ 
+ As described above, you need to update one revision and delete all the conflicting
+ revisions explicitly. This can be done using a single POST to _bulk_docs, setting
+ `"_deleted":true` on those revisions you wish to delete.
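
The request body for that single POST could be assembled along these lines. This is a sketch only: `resolution_payload` is a hypothetical helper, the merge itself is left to the application, and the result still has to be sent as JSON to `/db/_bulk_docs` by whatever HTTP client you use.

```python
import json


def resolution_payload(merged_fields, keep, discard):
    """Build a _bulk_docs body that updates `keep` and deletes `discard`.

    merged_fields: application-level fields of the merged document
    keep:          the revision to update (needs "_id" and "_rev")
    discard:       the other conflicting revisions (each needs "_rev")
    """
    doc_id = keep["_id"]
    # New revision on the surviving branch, carrying the merged content.
    update = dict(merged_fields, _id=doc_id, _rev=keep["_rev"])
    # One deletion per remaining branch, marked with "_deleted": true.
    deletions = [
        {"_id": doc_id, "_rev": doc["_rev"], "_deleted": True}
        for doc in discard
    ]
    return {"docs": [update] + deletions}


# The three conflicting revisions from the example above.
winner = {"_id": "test", "_rev": "2-b91bb807b4685080c6a651115ff558f5", "hello": "bar"}
others = [
    {"_id": "test", "_rev": "2-65db2a11b5172bf928e3bcf59f728970", "hello": "baz"},
    {"_id": "test", "_rev": "2-5bc3c6319edf62d4c624277fdd0ae191", "hello": "foo"},
]

payload = resolution_payload({"hello": "bar+baz+foo"}, winner, others)
print(json.dumps(payload, indent=2))
```

Updating the winning revision is a choice made for this example; any one of the conflicting revisions can be the one that is kept, as long as the rest are deleted.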
  
  == View API ==
  
@@ -513, +581 @@

  branch.  This is different to couchdb, which doesn't keep any peer state in
  the database.
  
+ Another difference is that git maintains all history back to time zero.
+ Git compaction keeps diffs between all those versions in order to reduce size,
+ whereas couchdb compaction discards the old versions. If you are constantly
+ updating a document, the size of a git repo would grow forever. It is possible
+ (with some effort) to use "history rewriting" to make git forget commits earlier
+ than a particular one.
+ 
  == Amazon Dynamo ==
  
  [[http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html|Dynamo]]
