Subject: Re: Resolving replication conflicts for deleted documents in CouchDB
From: Robert Newson
To: user@couchdb.apache.org
Date: Thu, 25 Oct 2012 15:29:26 +0100

Hi,

Thanks for clarifying.

I don't think you can achieve your desired result at a lower level
than your proposal to use your own deleted flag (and account for that
in views, etc).

Does it help at all that a deleted document can contain any set of
properties you like? The DELETE method translates internally to a PUT
{_id:id, _rev:new_rev, _deleted:true}. You can delete a document by
adding _deleted:true and keep any properties you like in there.

Btw, I stopped populating StackOverflow with answers when they started
abusing their contact database.

B.

On 25 October 2012 14:47, Alexander Bolodurin wrote:
> Thanks Robert,
>
> I understand the mechanics, but it doesn't quite solve my problem yet.
>
> In your example it's clear: one replica edits foo, another one deletes foo, so both will see a live and a _deleted revision.
> But it's not the only case. If I happened to resolve a regular edit conflict and delete one revision, the result is identical (as it should be).
> Except in the second case I shouldn't delete the live revision, because it has been introduced as a result of conflict resolution; the user hasn't deleted anything.
>
> As far as I can tell, there is no way to tell the "origin" of a deleted revision, at least this way.
>
> Example: https://gist.github.com/3952603
>
> On 25/10/2012, at 11:17 PM, Robert Newson wrote:
>
>> A deletion is just an update. The algorithm that CouchDB uses to
>> choose one leaf out of many deliberately chooses _deleted:false over
>> _deleted:true.
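As a rough illustration of the leaf-selection rule quoted above, the following Python sketch picks a winner from a list of leaf revisions. Only the preference for non-deleted leaves comes from this thread; the tie-breakers (higher revision position, then higher revision hash) and the names `parse_rev` and `pick_winner` are assumptions made for illustration, not CouchDB's actual code.

```python
def parse_rev(rev):
    """Split a revision string such as '2-7051...' into (2, '7051...')."""
    pos, _, digest = rev.partition("-")
    return int(pos), digest


def pick_winner(leaves):
    """Return the leaf that would be shown as the current document.

    `leaves` is a list of dicts carrying '_rev' and, for tombstones,
    '_deleted': True. Ordering (assumed): live before deleted, then
    higher revision position, then higher revision hash.
    """
    def sort_key(leaf):
        pos, digest = parse_rev(leaf["_rev"])
        return (not leaf.get("_deleted", False), pos, digest)

    return max(leaves, key=sort_key)


# The two leaf revisions from the test run in this thread.
leaves = [
    {"_id": "foo", "_rev": "2-7051cbe5c8faecd085a3fa619e6e6337"},
    {"_id": "foo", "_rev": "2-eec205a9d413992850a6e32678485900", "_deleted": True},
]

print(pick_winner(leaves)["_rev"])  # prints 2-7051cbe5c8faecd085a3fa619e6e6337
```

The live leaf wins even though the tombstone's hash sorts higher, which is why the edited document reappears after replication.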
>>
>> Here's a test run I just performed on couchdb/master;
>>
>> # setup instance #1
>> curl localhost:5984/alex -XPUT
>> {"ok":true}
>>
>> curl localhost:5984/alex/foo -XPUT -d{}
>> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
>>
>> # setup identical instance #2
>> curl localhost:5984/alex2 -XPUT
>> {"ok":true}
>>
>> curl localhost:5984/alex2/foo -XPUT -d{}
>> {"ok":true,"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}
>>
>> # update doc in instance #1
>> curl localhost:5984/alex/foo -XPUT -d '{"_rev":"1-967a00dff5e02add41819138abb3284d"}'
>>
>> # delete doc in instance #2
>> curl 'localhost:5984/alex2/foo?rev=1-967a00dff5e02add41819138abb3284d' -XDELETE
>>
>> # replicate instance #2 into instance #1
>> curl localhost:5984/_replicate -Hcontent-type:application/json -d '{"source":"alex2","target":"alex"}'
>> {"ok":true,"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","source_last_seq":2,"replication_id_version":3,"history":[{"session_id":"ed33d539fe675ac22b76c0a7be3fe1bf","start_time":"Thu, 25 Oct 2012 12:10:54 GMT","end_time":"Thu, 25 Oct 2012 12:10:54 GMT","start_last_seq":0,"end_last_seq":2,"recorded_seq":2,"missing_checked":1,"missing_found":1,"docs_read":1,"docs_written":1,"doc_write_failures":0}]}
>>
>> curl localhost:5984/alex/foo
>> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
>>
>> curl 'localhost:5984/alex/foo?open_revs=all'
>> --2b1fcadf47010c46a3afa22b7533dd07
>> Content-Type: application/json
>>
>> {"_id":"foo","_rev":"2-7051cbe5c8faecd085a3fa619e6e6337"}
>> --2b1fcadf47010c46a3afa22b7533dd07
>> Content-Type: application/json
>>
>> {"_id":"foo","_rev":"2-eec205a9d413992850a6e32678485900","_deleted":true}
>> --2b1fcadf47010c46a3afa22b7533dd07--
>>
>> As you can see, the first database, alex, will show the non-deleted
>> doc as per our algorithm, but the doc has two leaf revisions now.
>> To resolve in the direction you want, delete the
>> 2-7051cbe5c8faecd085a3fa619e6e6337 revision;
>>
>> curl 'localhost:5984/alex/foo?rev=2-7051cbe5c8faecd085a3fa619e6e6337' -XDELETE
>> {"ok":true,"id":"foo","rev":"3-7379b9e515b161226c6559d90c4dc49f"}
>>
>> curl 'localhost:5984/alex/foo'
>> {"error":"not_found","reason":"deleted"}
>>
>> B.
>>
>> On 25 October 2012 01:29, Alexander Bolodurin
>> wrote:
>>> Hi,
>>>
>>> (I have asked this at StackOverflow, but, unsurprisingly, the question didn't get much attention.)
>>>
>>> I'm designing replication conflict handling for a system, and one of its assumptions is that deletion always takes precedence when resolving conflicts: a deleted document stays deleted regardless of what edits it conflicts with, and IDs are not reused.
>>>
>>> The "official" way of resolving replication conflicts (read conflicting revisions, merge in the application code, delete unwanted revisions) is not applicable to deleted documents. If a document is edited on instance 1, and deleted on instance 2, after replication both instances get the revision from 1. Because only one leaf revision is alive, the document ends up "undeleted", and without conflicts. The other revision ends up in the _deleted_conflicts field, instead of _conflicts, but I can't use _deleted_conflicts as a cue that a document was deleted, because it includes deleted revisions from resolving edit conflicts and documents that were deleted and then re-added, so it's too general and conflates several cases.
>>>
>>> How can I get around this at the CouchDB level? Moving it up to the application layer gets really hairy really quickly, as now I have to have my custom "deleted" flag, rewrite my views, test more code and have extra batch jobs to clean up records marked for deletion.
>>>
>>> Regards,
>>> Alex.
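
One possible way to apply the suggestion from the top of this thread (a tombstone may carry arbitrary properties) is to tag user-initiated deletions, so they can be told apart from tombstones left behind by conflict resolution. The Python sketch below only builds and inspects document bodies; the property name `deleted_by_user` and the helpers `make_tombstone` and `was_deleted_by_user` are hypothetical and not part of CouchDB. It also assumes the application reads all leaf revisions itself (for example via `open_revs=all`, as in the test run) and that tombstone bodies come back with their extra properties intact.

```python
import json

USER_DELETE_MARKER = "deleted_by_user"  # hypothetical property name


def make_tombstone(doc_id, rev, **extra):
    """Build a body that deletes `doc_id` when PUT on top of revision `rev`.

    Any extra properties are stored on the deleted revision.
    """
    body = {"_id": doc_id, "_rev": rev, "_deleted": True}
    body.update(extra)
    return body


def was_deleted_by_user(leaves):
    """True if any deleted leaf carries the user-deletion marker."""
    return any(
        leaf.get("_deleted") and leaf.get(USER_DELETE_MARKER)
        for leaf in leaves
    )


# A user deletes the document: the tombstone is tagged.
user_tombstone = make_tombstone(
    "foo", "1-967a00dff5e02add41819138abb3284d", deleted_by_user=True
)
# Conflict resolution discards a losing revision: no tag.
resolution_tombstone = make_tombstone("foo", "2-7051cbe5c8faecd085a3fa619e6e6337")

print(json.dumps(user_tombstone, sort_keys=True))
print(was_deleted_by_user([{"_id": "foo", "_rev": "2-abc"}, user_tombstone]))
print(was_deleted_by_user([{"_id": "foo", "_rev": "2-abc"}, resolution_tombstone]))
```

With a marker like this, a resolver could delete the surviving live leaf only when a tagged tombstone is present, and leave documents alone when the only tombstones are untagged.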