couchdb-user mailing list archives

From Alexander Bolodurin <alexander.bolodu...@gmail.com>
Subject Re: Resolving replication conflicts for deleted documents in CouchDB
Date Thu, 25 Oct 2012 13:47:46 GMT
Thanks Robert,

I understand the mechanics, but it doesn't quite solve my problem yet.

In your example it's clear: one replica edits foo, another one deletes foo, so both will see
a live revision and a _deleted revision.
But it's not the only case. If I happened to resolve a regular edit conflict and delete one
revision, the result is identical (as it should be).
Except in the second case I shouldn't delete the live revision, because the _deleted revision
was introduced as a result of conflict resolution; the user hasn't deleted anything.

As far as I can tell, there is no way to tell the "origin" of a deleted revision, at least
this way.

Example: https://gist.github.com/3952603
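
To make the ambiguity concrete, here is a minimal sketch that splits a document's leaf revisions into live and deleted ones. It assumes the JSON-array form of the `open_revs=all` response (a list of `{"ok": doc}` entries, as returned when the request sends `Accept: application/json`); `split_leaves` is a hypothetical helper name, and the revision IDs in the second case are made up for illustration.

```python
def split_leaves(open_revs_body):
    """Split one document's leaf revisions into (live, deleted) lists of rev IDs."""
    live, deleted = [], []
    for entry in open_revs_body:
        doc = entry.get("ok")
        if doc is None:  # skip entries such as {"missing": "<rev>"}
            continue
        target = deleted if doc.get("_deleted") else live
        target.append(doc["_rev"])
    return live, deleted


# Case 1: leaves from Robert's test run (edit on one replica, delete on the other).
user_deleted = [
    {"ok": {"_id": "foo", "_rev": "2-7051cbe5c8faecd085a3fa619e6e6337"}},
    {"ok": {"_id": "foo", "_rev": "2-eec205a9d413992850a6e32678485900",
            "_deleted": True}},
]

# Case 2: an edit conflict resolved by deleting the losing revision.
# These revision IDs are hypothetical.
resolved_conflict = [
    {"ok": {"_id": "foo", "_rev": "2-aaaa"}},
    {"ok": {"_id": "foo", "_rev": "3-bbbb", "_deleted": True}},
]

for name, body in (("user_deleted", user_deleted),
                   ("resolved_conflict", resolved_conflict)):
    live, deleted = split_leaves(body)
    print(name, "live:", live, "deleted:", deleted)
```

Both cases come out as one live leaf plus one deleted leaf, and nothing on the deleted leaf itself says whether it came from a user's delete or from conflict resolution.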

On 25/10/2012, at 11:17 PM, Robert Newson wrote:

> A deletion is just an update. The algorithm that CouchDB uses to
> choose one leaf out of many deliberately chooses _deleted:false over
> _deleted:true.
> 
> Here's a test run I just performed on couchdb/master;
> 
> # setup instance #1
> curl localhost:5984/alex -XPUT
> {"ok":true}
> 
> curl localhost:5984/alex/foo -XPUT -d{}
> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
> 
> # setup identical instance #2
> curl localhost:5984/alex2 -XPUT
> {"ok":true}
> 
> curl localhost:5984/alex2/foo -XPUT -d{}
> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
> 
> # update doc in instance #1
> curl localhost:5984/alex/foo -XPUT -d '{"_rev":"1-967a00dff5e02add41819138abb3284d"}'
> 
> # delete doc in instance #2
> curl 'localhost:5984/alex2/foo?rev=1-967a00dff5e02add41819138abb3284d' -XDELETE
> 
> curl localhost:5984/_replicate -Hcontent-type:application/json -d '{"source":"alex2","target":"alex"}'
> {"ok":true,"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","source_last_seq":2,"replication_id_version":3,"history":[{"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","start_time":"Thu, 25 Oct 2012 12:10:54 GMT","end_time":"Thu, 25 Oct 2012 12:10:54 GMT","start_last_seq":0,"end_last_seq":2,"recorded_seq":2,"missing_checked":1,"missing_found":1,"docs_read":1,"docs_written":1,"doc_write_failures":0}]}
> 
> curl localhost:5984/alex/foo
> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
> 
> curl 'localhost:5984/alex/foo?open_revs=all'
> --2b1fcadf47010c46a3afa22b7533dd07
> Content-Type: application/json
> 
> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
> --2b1fcadf47010c46a3afa22b7533dd07
> Content-Type: application/json
> 
> {"_id":"foo","_rev":"2-eec205a9d413992850a6e32678485900","_deleted":true}
> --2b1fcadf47010c46a3afa22b7533dd07--
> 
> As you can see, the first database, alex, will show the non-deleted
> doc as per our algorithm, but the doc has two leaf revisions now. To
> resolve in the direction you want, delete the
> 2-7051cbe5c8faecd085a3fa619e6e6337 revision;
> 
> curl 'localhost:5984/alex/foo?rev=2-7051cbe5c8faecd085a3fa619e6e6337' -XDELETE
> {"ok":true,"id":"foo","rev":"3-7379b9e515b161226c6559d90c4dc49f"}
> 
> curl 'localhost:5984/alex/foo'
> {"error":"not_found","reason":"deleted"}
> 
> B.
> 
> On 25 October 2012 01:29, Alexander Bolodurin
> <alexander.bolodurin@gmail.com> wrote:
>> Hi,
>> 
>> (I have asked this at StackOverflow, but, unsurprisingly, the question didn't get
>> much attention.)
>> 
>> I'm designing replication conflict handling for a system, and one of its assumptions
>> is that deletion always takes precedence when resolving conflicts: a deleted document stays
>> deleted regardless of what edits it conflicts with, and IDs are not reused.
>> 
>> The "official" way of resolving replication conflicts (read conflicting revisions,
>> merge in the application code, delete unwanted revisions) is not applicable to deleted documents.
>> If a document is edited on instance 1, and deleted on instance 2, after replication both instances
>> get the revision from 1. Because only one leaf revision is alive, the document ends up "undeleted",
>> and without conflicts. The other revision ends up in the _deleted_conflicts field, instead of
>> _conflicts, but I can't use _deleted_conflicts as a cue that a document was deleted, because
>> it includes deleted revisions from resolving edit conflicts and documents that were deleted
>> and then re-added, so it's too general and conflates several cases.
>> 
>> How can I get around this at the CouchDB level? Moving it up to the application layer
>> gets really hairy really quickly, as now I have to have my custom "deleted" flag, rewrite my
>> views, test more code and have extra batch jobs to clean up records marked for deletion.
>> 
>> Regards,
>> Alex.
> 
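
The "undeleted" outcome described in the original question follows from how the winning leaf is picked. Below is a rough sketch of that choice, not CouchDB's actual code. Robert's message confirms only the first criterion (live leaves beat deleted ones); the tie-breakers (higher revision position, then the greater revision hash) are my understanding of the documented rule and should be treated as an assumption. `parse_rev` and `pick_winner` are hypothetical names.

```python
def parse_rev(rev):
    """Split a revision ID such as '2-7051...' into (position, hash)."""
    pos, _, digest = rev.partition("-")
    return int(pos), digest


def pick_winner(leaves):
    """Pick the leaf a plain GET would show.

    Sort key, highest wins: live before deleted, then revision position,
    then revision hash (the last two are assumed tie-breakers).
    """
    def key(doc):
        pos, digest = parse_rev(doc["_rev"])
        return (not doc.get("_deleted", False), pos, digest)
    return max(leaves, key=key)


# Leaves in database "alex" right after replication in Robert's run.
before = [
    {"_rev": "2-7051cbe5c8faecd085a3fa619e6e6337"},
    {"_rev": "2-eec205a9d413992850a6e32678485900", "_deleted": True},
]

# Leaves after deleting 2-7051...: both branches now end in a deletion.
after = [
    {"_rev": "3-7379b9e515b161226c6559d90c4dc49f", "_deleted": True},
    {"_rev": "2-eec205a9d413992850a6e32678485900", "_deleted": True},
]

print(pick_winner(before)["_rev"])
print(pick_winner(after).get("_deleted", False))
```

With the `before` leaves the live revision wins, which matches the plain GET in Robert's run; once every leaf is deleted, the winner is itself a deleted revision and the document reads as not found.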

