incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Resolving replication conflicts for deleted documents in CouchDB
Date Thu, 25 Oct 2012 14:29:26 GMT
Hi,

Thanks for clarifying. I don't think you can achieve your desired
result at a lower level than your proposal to use your own deleted
flag (and account for that in views, etc). Does it help at all that a
deleted document can contain any set of properties you like? The
DELETE method translates internally to a PUT {_id:id, _rev:new_rev,
_deleted:true}. You can delete a document by adding _deleted:true and
keep any properties you like in there.
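
A minimal sketch of that idea in Python, not a definitive implementation. It assumes a hypothetical marker property named `deleted_by_user` (CouchDB attaches no meaning to it) that is written only when a user really deletes a document, and never when a losing revision is deleted during conflict resolution. It also assumes the leaf revisions have already been fetched, for example with `?open_revs=all`, and that each tombstone comes back with its stored properties. The helper names are illustrative.

```python
import json

# Hypothetical marker property; the name is an assumption, not a CouchDB field.
USER_DELETED = "deleted_by_user"


def user_tombstone(doc_id, rev, **extra):
    """Body for PUT /db/<doc_id> that deletes the document on a user's behalf."""
    body = {"_id": doc_id, "_rev": rev, "_deleted": True, USER_DELETED: True}
    body.update(extra)  # any other properties you want to keep in the tombstone
    return body


def resolution_tombstone(doc_id, rev):
    """Body that deletes a losing revision during conflict resolution (no marker)."""
    return {"_id": doc_id, "_rev": rev, "_deleted": True}


def deleted_by_user(leaves):
    """True if any leaf revision is a tombstone that carries the marker.

    `leaves` is a list of leaf documents as plain dicts.
    """
    return any(leaf.get("_deleted") and leaf.get(USER_DELETED) for leaf in leaves)


# Usage: one live leaf plus one tombstone written by a real user deletion.
leaves = [
    {"_id": "foo", "_rev": "2-7051cbe5c8faecd085a3fa619e6e6337"},
    {"_id": "foo", "_rev": "2-eec205a9d413992850a6e32678485900",
     "_deleted": True, USER_DELETED: True},
]
if deleted_by_user(leaves):
    # Deletion takes precedence: the live leaf should be deleted as well.
    print(json.dumps(resolution_tombstone("foo", leaves[0]["_rev"])))
```

With this convention a tombstone left behind by conflict resolution has no marker, so it can be told apart from a user's deletion without a separate "deleted" flag on live documents.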

Btw, I stopped populating StackOverflow with answers when they started
abusing their contact database.

B.

On 25 October 2012 14:47, Alexander Bolodurin
<alexander.bolodurin@gmail.com> wrote:
> Thanks Robert,
>
> I understand the mechanics, but it doesn't quite solve my problem yet.
>
> In your example it's clear: one replica edits foo, another one deletes
> foo, so both will see a live and a _deleted revision.
> But it's not the only case. If I happened to resolve a regular edit
> conflict and delete one revision, the result is identical (as it should
> be).
> Except in the second case I shouldn't delete the live revision, because
> it has been introduced as a result of conflict resolution; the user
> hasn't deleted anything.
>
> As far as I can tell, there is no way to tell the "origin" of a deleted
> revision, at least this way.
>
> Example: https://gist.github.com/3952603
>
> On 25/10/2012, at 11:17 PM, Robert Newson wrote:
>
>> A deletion is just an update. The algorithm that CouchDB uses to
>> choose one leaf out of many deliberately chooses _deleted:false over
>> _deleted:true.
>>
>> Here's a test run I just performed on couchdb/master;
>>
>> # setup instance #1
>> curl localhost:5984/alex -XPUT
>> {"ok":true}
>>
>> curl localhost:5984/alex/foo -XPUT -d{}
>> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
>>
>> # setup identical instance #2
>> curl localhost:5984/alex2 -XPUT
>> {"ok":true}
>>
>> curl localhost:5984/alex2/foo -XPUT -d{}
>> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
>>
>> # update doc in instance #1
>> curl localhost:5984/alex/foo -XPUT -d '{"_rev":"1-967a00dff5e02add41819138abb3284d"}'
>>
>> # delete doc in instance #2
>> curl localhost:5984/alex2/foo?rev=1-967a00dff5e02add41819138abb3284d  -XDELETE
>>
>> curl localhost:5984/_replicate -Hcontent-type:application/json -d '{"source":"alex2","target":"alex"}'
>> {"ok":true,"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","source_last_seq":2,"replication_id_version":3,"history":[{"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","start_time":"Thu, 25 Oct 2012 12:10:54 GMT","end_time":"Thu, 25 Oct 2012 12:10:54 GMT","start_last_seq":0,"end_last_seq":2,"recorded_seq":2,"missing_checked":1,"missing_found":1,"docs_read":1,"docs_written":1,"doc_write_failures":0}]}
>>
>> curl localhost:5984/alex/foo
>> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
>>
>> curl 'localhost:5984/alex/foo?open_revs=all'
>> --2b1fcadf47010c46a3afa22b7533dd07
>> Content-Type: application/json
>>
>> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
>> --2b1fcadf47010c46a3afa22b7533dd07
>> Content-Type: application/json
>>
>> {"_id":"foo","_rev":"2-eec205a9d413992850a6e32678485900","_deleted":true}
>> --2b1fcadf47010c46a3afa22b7533dd07--
>>
>> As you can see, the first database, alex, will show the non-deleted
>> doc as per our algorithm, but the doc has two leaf revisions now. To
>> resolve in the direction you want, delete the
>> 2-7051cbe5c8faecd085a3fa619e6e6337 revision;
>>
>> curl localhost:5984/alex/foo?rev=2-7051cbe5c8faecd085a3fa619e6e6337 -XDELETE
>> {"ok":true,"id":"foo","rev":"3-7379b9e515b161226c6559d90c4dc49f"}
>>
>> curl 'localhost:5984/alex/foo'
>> {"error":"not_found","reason":"deleted"}
>>
>> B.
>>
>> On 25 October 2012 01:29, Alexander Bolodurin
>> <alexander.bolodurin@gmail.com> wrote:
>>> Hi,
>>>
>>> (I have asked this at StackOverflow, but, unsurprisingly, the question
>>> didn't get much attention.)
>>>
>>> I'm designing replication conflict handling for a system, and one of
>>> its assumptions is that deletion always takes precedence when resolving
>>> conflicts: a deleted document stays deleted regardless of what edits it
>>> conflicts with, and IDs are not reused.
>>>
>>> The "official" way of resolving replication conflicts (read conflicting
>>> revisions, merge in the application code, delete unwanted revisions) is
>>> not applicable to deleted documents. If a document is edited on
>>> instance 1 and deleted on instance 2, after replication both instances
>>> get the revision from 1. Because only one leaf revision is alive, the
>>> document ends up "undeleted", and without conflicts. The other revision
>>> ends up in the _deleted_conflicts field instead of _conflicts, but I
>>> can't use _deleted_conflicts as a cue that a document was deleted,
>>> because it includes deleted revisions from resolving edit conflicts and
>>> documents that were deleted and then re-added, so it's too general and
>>> conflates several cases.
>>>
>>> How can I get around this at the CouchDB level? Moving it up to the
>>> application layer gets really hairy really quickly, as now I have to
>>> have my custom "deleted" flag, rewrite my views, test more code and have
>>> extra batch jobs to clean up records marked for deletion.
>>>
>>> Regards,
>>> Alex.
>>
>
