couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: struggling with couchdb in production
Date Thu, 28 May 2009 15:16:12 GMT
On Thu, May 28, 2009 at 8:14 AM, Chris Anderson <jchris@apache.org> wrote:
> On Thu, May 28, 2009 at 7:33 AM, Damien Katz <damien@apache.org> wrote:
>>
>> On May 28, 2009, at 10:19 AM, Brian Candler wrote:
>>
>>>> You can use [open_]revs=all to open all the conflicts (deleted conflicts
>>>> too)
>>>
>>> Ah, open_revs=all is new to me - it works fine, although knowing about
>>> deleted revisions isn't of particular interest. What I want is all live
>>> (current) conflicting versions.
>>>
>>> It seems to me that this is something that Amazon Dynamo got right:
>>>
>>> * A GET gives you all "live" versions of a document, plus an opaque
>>>  context
>>>
>>> * A PUT of an updated document (which includes this context object)
>>>  replaces the corresponding set of old versions with this one
>>>
>>> * A PUT never fails, but may introduce conflicting versions
>>>
>>> This is both simple and powerful, and dealing with conflicts would then
>>> become pretty easy. As a side benefit: you would no longer need an
>>> API to fetch an item by _rev, which would make it less likely that people
>>> would confuse CouchDB with an RCS :-)
>>>
>>> There is only one reason I can see that CouchDB picks a "preferred" version
>>> from amongst the conflicts, and that is for the benefit of views. However,
>>> even that problem goes away if you just pass *all* versions of a document to
>>> the map function.
>>>
>>>  function(docs) { ... }
>>>
>>> The map function may then choose to:
>>> - emit keys corresponding to docs[0] only (= current behaviour)
>>> - emit keys corresponding to all docs
>>> - perform some application-specific view merging
>>>
>>> As long as the conflicting versions of the doc are returned in a
>>> deterministic order, then both clients and views *could* choose to work in
>>> the current way (by just picking the first version and ignoring the others),
>>> but they would be encouraged to highlight and/or resolve the conflicts at
>>> the earliest opportunity.
>>>
>>>> Also, bulk document retrieval via POST where the post body specifies
>>>> the docs and revisions is something we'd like to see added to the
>>>> front end too.
>>>
>>> I think this is adding more complexity to the API. When would you really
>>> want to get a specific rev or set of revs, rather than *all* live
>>> conflicting revs?
>>>
>>>> Patches welcome, I and others in community will be glad to help you.
>>>
>>> I suspect that what I'm suggesting is too radical to stand much chance of
>>> being merged :-( The path of least resistance, for now, is to avoid all
>>> replication other than master->slave.
>>>
>>
>> Doesn't sound too radical at all. I'd like to see how well it works in
>> practice.
>>
>
> I agree with Damien here. What you're suggesting doesn't sound like a
> departure from the way CouchDB operates, just a variation or
> refinement. We try to treat conflicts as a normal state. Currently we
> also make it easy to ignore conflicts if your application is naive.
> These changes would make it harder to be naive, which raises the bar
> for entry but also ensures applications are capable of handling
> multi-master replication.
>

I just realized I should add that the "standard" way of dealing with
replication conflicts so far has been to create a view which lists all
docs that have conflicts, and to set up a background process that queries
it and resolves any conflicts it finds.
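
As a rough illustration (a sketch, not code from this thread), such a view
could be built from a map function that checks the `_conflicts` field of
each doc. The `emit` stub and the `losingRevs` helper below are assumptions
added so the example runs standalone under Node; inside CouchDB the view
server provides `emit`, and how the winning rev is chosen is up to the
application.

```javascript
// Standalone sketch; inside CouchDB the view server provides `emit`.
// The stub below is an assumption so the example can run under Node.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function for a "conflicts" view: one row per document that currently
// has conflicting revisions. The value lists the doc's current rev first,
// followed by the conflicting revs.
function conflictsMap(doc) {
  if (doc._conflicts) {
    emit(doc._id, [doc._rev].concat(doc._conflicts));
  }
}

// Hypothetical helper for the background process: given the revs from a
// view row and the rev the application chose to keep, return the revs that
// would need to be deleted to clear the conflict.
function losingRevs(revs, keepRev) {
  return revs.filter(function (rev) {
    return rev !== keepRev;
  });
}
```

The background process would then query the view, decide which rev to keep
for each row (or write a merged doc), and delete the remaining revs.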

Chris

-- 
Chris Anderson
http://jchrisa.net
http://couch.io
