incubator-couchdb-user mailing list archives

From Brian Candler <>
Subject Re: Re: How reliable is the versioning system?
Date Wed, 29 Jul 2009 09:28:09 GMT
On Sun, Jul 26, 2009 at 02:19:23PM -0700, Robert Ames wrote:
> In addition, I see here an explicit recommendation is to maintain revision
> history outside of CouchDB, and it seems as though the replication model
> is pretty similar to Git's model...

The replication model is quite different to Git's - although this would be a
useful comparison to put on the wiki somewhere.

With Git, you have a copy of each peer's tree (remotes/PEER/BRANCH). You then
perform a merge into a single document in your working tree. If the merge
cannot complete cleanly, you still have a single working copy, but with the
conflicts explicitly marked within that document. It's up to you to resolve
those conflicts and commit the final version.

With Git, merging is always done when you *pull* from a peer; if you *push*
to a peer whose branch can't be fast-forwarded, the push fails. (Or you can
force the push, but that simply overwrites the peer's changes.)

With CouchDB, you don't keep track of peers in the database. When you
replicate from a peer, or a peer replicates to you, and the documents
conflict(*), then you get multiple copies of the document within the
database. When you request a document by ID, you get an arbitrary member of
that set, unless you explicitly ask for the other versions. However, the
multiple copies are all there and are all effectively equal (except that one
is arbitrarily chosen as the "winner"); no precedence is given to the
version which originated locally, for instance.

(*) In this case, 'conflict' effectively means 'derived from a different
_rev'. There is no attempt in CouchDB to perform any merging.
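That behaviour can be shown with a toy model (this is not the CouchDB
implementation, just a sketch of the semantics described above): edits
derived from different _revs survive replication as sibling revisions, no
merge is attempted, and one sibling is picked as the "winner" without regard
to where it originated.

```python
class Db:
    def __init__(self):
        self.docs = {}                 # doc id -> set of leaf revisions

    def put(self, doc_id, rev):
        self.docs.setdefault(doc_id, set()).add(rev)

    def replicate_from(self, other):
        # Union of leaf revisions: conflicting revs are simply kept
        # side by side; there is no attempt to merge their contents.
        for doc_id, revs in other.docs.items():
            self.docs.setdefault(doc_id, set()).update(revs)

    def get(self, doc_id):
        revs = self.docs[doc_id]
        winner = max(revs)             # deterministic but essentially arbitrary
        return winner, sorted(revs - {winner})

a, b = Db(), Db()
a.put("doc1", "2-aaa")                 # both edits derive from the same 1- rev
b.put("doc1", "2-bbb")
a.replicate_from(b)
winner, conflicts = a.get("doc1")
print(winner, conflicts)               # one rev "wins"; the other is still there
```

Note that after replication the database that made the "2-aaa" edit locally
does not prefer it: the winner is picked by a rule, not by origin.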

A CouchDB-like system running on top of git would be extremely interesting.
I can see four main parts:

- a high-performance git backend which appends objects to a single file.
  The compact operation creates a new file and rotates it into place, and
  ideally retains git's ability to create packs and compact using diffs.

- a new 'btree' git object class, for mapping an unlimited number of keys
  to objects. refs would be stored using this. The existing flat 'tree'
  object won't scale well to millions of keys.

- an HTTP interface for storing and retrieving objects by key, again using
  the 'btree' class (*)

- the map/reduce engine ported to run on top of this, following the commit
  tree to determine which documents had changed.
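The four parts above could fit together roughly as follows. This is a rough
sketch under heavy simplification: a content-addressed object store (as in
git), a plain dict standing in for the proposed 'btree' object, and commits
recording each state of that map so that changed documents can be found by
walking the history. All names here are invented for illustration.

```python
import hashlib, json

objects = {}                           # sha1 -> raw bytes (the object store)

def store(data):
    """Content-addressed storage, git-style."""
    oid = hashlib.sha1(data).hexdigest()
    objects[oid] = data
    return oid

def commit(index, parent):
    # The key->object map (the stand-in 'btree') is itself stored as an
    # object; the commit points at it, plus its parent commit.
    tree_id = store(json.dumps(index, sort_keys=True).encode())
    return store(json.dumps({"tree": tree_id, "parent": parent}).encode())

def changed_keys(commit_id):
    # Diff a commit's index against its parent's -- this is what the
    # map/reduce engine would use to find documents to reprocess.
    c = json.loads(objects[commit_id])
    new = json.loads(objects[c["tree"]])
    old = {} if c["parent"] is None else json.loads(
        objects[json.loads(objects[c["parent"]])["tree"]])
    return {k for k in new if old.get(k) != new.get(k)}

c1 = commit({"doc1": store(b'{"v":1}')}, None)
c2 = commit({"doc1": store(b'{"v":2}'), "doc2": store(b'{}')}, c1)
print(sorted(changed_keys(c2)))        # only the documents touched in c2
```

A real 'btree' object would of course be an on-disk structure rather than
one JSON blob, precisely so that millions of keys don't have to be rewritten
on every commit.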

There would need to be some way to handle merging and conflicts. Perhaps the
HTTP interface returns all peer versions as a multipart, with their commit
IDs as the rev.
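Purely as a speculative illustration of that multipart idea, a response might
carry each surviving version as one part, with its commit ID playing the role
of the _rev. The header names and layout below are invented, not any existing
CouchDB format.

```python
def multipart_conflicts(versions, boundary="rev-boundary"):
    """versions: list of (commit_id, json_body) pairs."""
    parts = []
    for commit_id, body in versions:
        parts.append(
            f"--{boundary}\r\n"
            f"Content-Type: application/json\r\n"
            f"X-Rev: {commit_id}\r\n\r\n"     # commit ID standing in for _rev
            f"{body}\r\n"
        )
    return "".join(parts) + f"--{boundary}--\r\n"

body = multipart_conflicts([
    ("c0ffee", '{"title": "draft A"}'),
    ("deadbf", '{"title": "draft B"}'),
])
print(body)
```

The client would then resolve the conflict and write back a new version whose
commit has all the returned IDs as parents.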

It also needs to be decided how to handle 'attachments'. Personally I'd like
to store the MIME type against every document, which means you could store
non-JSON objects as first-class objects in their own right. Then you could
do things like using map/reduce to scale JPEGs.

(*) Note: you might think of having one branch per document, instead of a
top-level btree object containing all documents. That would give each object
an independent history. The trouble with that approach is that git
replication doesn't work well with thousands of branches (I've tried it :-)
because it has to iterate linearly over each branch.

So I think that you really need one btree object for the database, and then
store the history of that via commits.


