couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Is it OK to interpret the generation count in a revision ID?
Date Wed, 13 Jul 2011 00:39:03 GMT
On Tue, Jul 12, 2011 at 7:11 PM, Jens Alfke <jens@couchbase.com> wrote:
>
> On Jul 12, 2011, at 2:54 PM, Paul Davis wrote:
>
> In this particular case, I'm not sure if we even document that the
> revs=true option exists, as I can't think of where it would be used
> outside the replicator so even the format of this response might be
> subject to change.
>
> No, it’s well documented both in the CouchDB wiki and in Couchbase’s API docs:
>
> http://wiki.apache.org/couchdb/HTTP_Document_API#GET
> http://www.couchbase.org/sites/default/files/uploads/all/documentation/couchbase-api-dbdoc.html#couchbase-api-dbdoc_db-doc_get-revs
>
> So is the ‘extended revision history’, ?revs_info=true.
>
> Not sure why the revision history would be viewed as internal, as it’s pretty important
> for conflict resolution.
>
> —Jens
>

There are a couple of issues at hand here. First, I'm pretty sure all of
our conflict resolution descriptions have focused on only looking at
the leaf nodes of the revision tree. If anything in the docs even comes
close to suggesting otherwise, it should be fixed.
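
To illustrate the leaf-only approach, here is a minimal sketch that works purely on an open_revs=all response (the data is copied from the output further down). The resolve() helper and the "largest val wins" rule are hypothetical; a real application would supply its own rule.

```python
# Leaf revisions as returned by GET /db/foo?open_revs=all
# (copied from the output in this message).
open_revs = [
    {"ok": {"_id": "foo", "_rev": "5-64d60892a872fef2c31260359d04a554", "val": 7}},
    {"ok": {"_id": "foo", "_rev": "5-ca289aa53cbbf35a5f5c799b64b1f16f", "val": 4}},
]


def resolve(leaves, pick):
    """Return (winner, losing_revs) using only leaf documents.

    `pick` is a hypothetical application-defined function that chooses
    which leaf to keep. Interior revisions are never consulted.
    """
    docs = [entry["ok"] for entry in leaves if "ok" in entry]
    live = [d for d in docs if not d.get("_deleted")]
    winner = pick(live)
    losing_revs = [d["_rev"] for d in live if d["_rev"] != winner["_rev"]]
    return winner, losing_revs


# Hypothetical rule: keep the leaf with the largest "val".
winner, losing_revs = resolve(
    open_revs, lambda docs: max(docs, key=lambda d: d["val"])
)
print(winner["_rev"])
print(losing_revs)
```

The application would then typically save the winner and delete the losing leaf revisions. That step is left out here.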

Secondly, _revs_info is probably closer to what you need, and I think
it's reasonable to say that we've accidentally committed to having that
as a public API. Though, as with all things related to non-leaf revisions,
this information should be used wisely. The only use case I know of
that is valid and acceptable is implementing an "instant
undo" operation (i.e., after a few minutes to hours, the undo might no
longer work).
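
A rough sketch of the "instant undo" check, assuming the _revs_info list is ordered newest first as in the output below. The previous_available() helper is hypothetical, and the "missing" entries in the second list are made-up data to show what an unavailable parent would look like.

```python
# _revs_info as returned by GET /db/foo?revs_info=true
# (first two entries copied from the output in this message).
revs_info = [
    {"rev": "5-ca289aa53cbbf35a5f5c799b64b1f16f", "status": "available"},
    {"rev": "4-c29100c4e452f57c5fa5d10d3f7cbfa8", "status": "available"},
]

# Made-up example of the same history once the old body is gone.
revs_info_after_cleanup = [
    {"rev": "5-ca289aa53cbbf35a5f5c799b64b1f16f", "status": "available"},
    {"rev": "4-c29100c4e452f57c5fa5d10d3f7cbfa8", "status": "missing"},
]


def previous_available(info):
    """Return the rev ID directly before the current one, or None.

    None means undo is not possible: either there is no earlier entry
    or its body is no longer available.
    """
    if len(info) < 2:
        return None
    parent = info[1]
    if parent["status"] != "available":
        return None
    return parent["rev"]


print(previous_available(revs_info))
print(previous_available(revs_info_after_cleanup))
```

If a rev ID comes back, the application could fetch that revision and save its body as a new revision. If None comes back, the undo has expired.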

Thirdly, I wouldn't consider this to be well documented. Judging from
both links it looks like someone was reading source code and decided
to make a note that the option existed. Both the Couchbase and the CouchDB
wiki docs have a phrase similar to "returns a list of the revisions",
which is very different from the reality of "returns a path of
revisions from the leaf towards the root, where the root may not exist
due to stemming". The algorithms involved in combining the list of
paths to form a tree structure are only documented as source code in
couch_key_tree.erl, and you should regard this as an internal API.
Thus, use of this option is of very limited value and probably should
not be relied upon until the replicator has been publicly defined as a
standard.
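
To make "a path from the leaf towards the root" concrete, here is a small sketch that expands a _revisions value into full rev IDs. It is based only on the response shape shown in the output below (a "start" number plus a leaf-first list of "ids"). The expand() helper is hypothetical, and since the format is internal it may change.

```python
# _revisions as returned by GET /db/foo?revs=true
# (copied from the output in this message).
revisions = {
    "start": 5,
    "ids": [
        "ca289aa53cbbf35a5f5c799b64b1f16f",
        "c29100c4e452f57c5fa5d10d3f7cbfa8",
        "99bacb41f223b45a936f8c988a3562dc",
        "29e744a448f741049ac9a46694718b67",
        "c43bcd498c5300a3b5f0f788be936549",
    ],
}


def expand(revs):
    """Turn {'start': N, 'ids': [...]} into a leaf-first list of rev IDs.

    The first id belongs to generation `start`; each later id is one
    generation older.
    """
    start = revs["start"]
    return ["{0}-{1}".format(start - i, h) for i, h in enumerate(revs["ids"])]


path = expand(revisions)
print(path[0])
print(path[-1])

# A stemmed path simply stops early: the oldest entry is not the root.
stemmed = expand({"start": 5, "ids": revisions["ids"][:3]})
print(stemmed[-1])
```

The last call shows why this is a path rather than a complete list: with only three ids the oldest entry is generation 3, and nothing in the response says what came before it.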


For the curious, I put together a script that sets up a non-linear
revision tree and prints the output for some of the various
revision-related options. The important bit I originally wanted to show
is the difference in _revisions when you specify a rev in the GET.
The others I added on a whim, just because it was easy and it illustrates
the different dimensionality of these things (i.e., leaves vs. paths).


Output:

conflicts=true
--------------

{'_conflicts': ['5-64d60892a872fef2c31260359d04a554'],
 '_id': 'foo',
 '_rev': '5-ca289aa53cbbf35a5f5c799b64b1f16f',
 'val': 4}

revs_info=true
--------------

{'_id': 'foo',
 '_rev': '5-ca289aa53cbbf35a5f5c799b64b1f16f',
 '_revs_info': [{'rev': '5-ca289aa53cbbf35a5f5c799b64b1f16f',
                 'status': 'available'},
                {'rev': '4-c29100c4e452f57c5fa5d10d3f7cbfa8',
                 'status': 'available'},
                {'rev': '3-99bacb41f223b45a936f8c988a3562dc',
                 'status': 'available'},
                {'rev': '2-29e744a448f741049ac9a46694718b67',
                 'status': 'available'},
                {'rev': '1-c43bcd498c5300a3b5f0f788be936549',
                 'status': 'available'}],
 'val': 4}

open_revs=all
-------------

[{'ok': {'_id': 'foo',
         '_rev': '5-64d60892a872fef2c31260359d04a554',
         'val': 7}},
 {'ok': {'_id': 'foo',
         '_rev': '5-ca289aa53cbbf35a5f5c799b64b1f16f',
         'val': 4}}]

revs=true
---------

{'_id': 'foo',
 '_rev': '5-ca289aa53cbbf35a5f5c799b64b1f16f',
 '_revisions': {'ids': ['ca289aa53cbbf35a5f5c799b64b1f16f',
                        'c29100c4e452f57c5fa5d10d3f7cbfa8',
                        '99bacb41f223b45a936f8c988a3562dc',
                        '29e744a448f741049ac9a46694718b67',
                        'c43bcd498c5300a3b5f0f788be936549'],
                'start': 5},
 'val': 4}

revs=true with rev specified.
-----------------------------

{'_id': 'foo',
 '_rev': '5-64d60892a872fef2c31260359d04a554',
 '_revisions': {'ids': ['64d60892a872fef2c31260359d04a554',
                        'c88c574fba4d0216114afd37deaaef62',
                        '7e7aa04dea7fd0c213518c4e395ff1f4',
                        '29e744a448f741049ac9a46694718b67',
                        'c43bcd498c5300a3b5f0f788be936549'],
                'start': 5},
 'val': 7}
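
The two revs=true responses above are two different paths through the same tree. As a sketch only, the following compares them to find where the branches diverge. The to_path() and newest_shared() helpers are hypothetical, and the approach assumes neither path has been stemmed past the branch point.

```python
# The two _revisions values from the revs=true outputs above.
winner_revs = {
    "start": 5,
    "ids": [
        "ca289aa53cbbf35a5f5c799b64b1f16f",
        "c29100c4e452f57c5fa5d10d3f7cbfa8",
        "99bacb41f223b45a936f8c988a3562dc",
        "29e744a448f741049ac9a46694718b67",
        "c43bcd498c5300a3b5f0f788be936549",
    ],
}
conflict_revs = {
    "start": 5,
    "ids": [
        "64d60892a872fef2c31260359d04a554",
        "c88c574fba4d0216114afd37deaaef62",
        "7e7aa04dea7fd0c213518c4e395ff1f4",
        "29e744a448f741049ac9a46694718b67",
        "c43bcd498c5300a3b5f0f788be936549",
    ],
}


def to_path(revs):
    """Leaf-first list of full rev IDs for one _revisions value."""
    return ["{0}-{1}".format(revs["start"] - i, h)
            for i, h in enumerate(revs["ids"])]


def newest_shared(path_a, path_b):
    """Return the newest rev ID present in both leaf-first paths, or None."""
    in_b = set(path_b)
    for rev in path_a:
        if rev in in_b:
            return rev
    return None


branch_point = newest_shared(to_path(winner_revs), to_path(conflict_revs))
print(branch_point)
```

A result of None would not prove the histories are unrelated; it may only mean the shared part has been stemmed away.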


revs.py:

#!/usr/bin/env python

import pprint
import couchdbkit

s = couchdbkit.Server(uri="http://127.0.0.1:5984")

# Start from a clean slate; ignore the error if a database does not exist yet.
for name in ("revs_test_1", "revs_test_2"):
    try:
        s.delete_db(name)
    except Exception:
        pass

d1 = s.create_db("revs_test_1")
d2 = s.create_db("revs_test_2")

# Create the document and update it once, so that both databases end up
# sharing the first two revisions.
d1["foo"] = {"val": 0}
doc1 = d1.open_doc("foo")
doc1["val"] = 1
d1.save_doc(doc1)
doc2 = dict(doc1)

s.replicate("revs_test_1", "revs_test_2")

# Update the document independently in each database so the histories diverge.
for i in range(3):
    doc1["val"] = i+2
    d1.save_doc(doc1)

    doc2["val"] = i+5
    d2.save_doc(doc2)

# Pull the divergent branch back into the first database, creating a conflict.
s.replicate("revs_test_2", "revs_test_1")


def hdr(d):
    # Print a section title underlined with dashes.
    print("\n%s" % d)
    print("%s\n" % ("-" * len(d)))

hdr("conflicts=true")
pprint.pprint(d1.open_doc("foo", conflicts=True))

hdr("revs_info=true")
pprint.pprint(d1.open_doc("foo", revs_info=True))

hdr("open_revs=all")
pprint.pprint(d1.open_doc("foo", open_revs="all"))

hdr("revs=true")
pprint.pprint(d1.open_doc("foo", revs=True))

# doc2["_rev"] is the leaf of the branch written in revs_test_2.
hdr("revs=true with rev specified.")
pprint.pprint(d1.open_doc("foo", rev=doc2["_rev"], revs=True))
