couchdb-user mailing list archives

From "Sanjuan, Hector" <hector.sanj...@here.com>
Subject Re: How to store the delta between doc revisions?
Date Thu, 02 Oct 2014 13:56:50 GMT
It is all taken care of on the client side.

I don't store deltas/patch files per se; I actually store the full "previous" and "current" versions
of the doc(s). The client should be able to produce a diff when needed, in whatever format is required.
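
To illustrate how a client could derive a diff from the stored "previous" and "current" versions, here is a minimal sketch. The helper name `diff_docs`, the output shape and the sample documents are assumptions made for illustration; they are not part of CouchDB or of any client library, and only top-level fields are compared.

```python
from typing import Any, Dict

Doc = Dict[str, Any]


def diff_docs(previous: Doc, current: Doc) -> Dict[str, Dict[str, Any]]:
    """Return a top-level, field-by-field diff between two document versions.

    Hypothetical helper: nested values are compared as a whole, not recursively.
    """
    ignored = {"_rev"}  # _rev changes on every write, so it is not a real edit
    changes: Dict[str, Dict[str, Any]] = {}
    for key in sorted((set(previous) | set(current)) - ignored):
        if key not in previous:
            changes[key] = {"op": "added", "new": current[key]}
        elif key not in current:
            changes[key] = {"op": "removed", "old": previous[key]}
        elif previous[key] != current[key]:
            changes[key] = {"op": "changed", "old": previous[key], "new": current[key]}
    return changes


# Made-up sample documents standing in for the stored versions.
previous = {"_id": "notes-42", "_rev": "1-abc", "title": "Client notes",
            "lines": ["call back"]}
current = {"_id": "notes-42", "_rev": "2-def", "title": "Client notes",
           "lines": ["call back", "send quote"], "owner": "eric"}

print(diff_docs(previous, current))
```

Because both full versions are kept, the same pair could just as well be fed to a text or JSON-patch style differ later on; nothing is lost by not choosing a diff format at write time.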

You could implement a caching mechanism if needed (memcache-like, if you want). In my case documents
are fairly small and I am not particularly worried about the delay introduced by an extra
GET.
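
A caching layer of that kind could look roughly like the sketch below. It is an in-process stand-in for something memcache-like; the class name `PreviousVersionCache`, the `fetch` callable and the sample data are all assumptions for illustration. The only CouchDB behaviour relied on is that a document sent for update carries the `_rev` of the revision it replaces.

```python
import copy
from typing import Any, Callable, Dict

Doc = Dict[str, Any]


class PreviousVersionCache:
    """Remembers documents as they are read so a later write can reuse them."""

    def __init__(self, fetch: Callable[[str], Doc]) -> None:
        self._fetch = fetch  # hypothetical callable that performs the real GET
        self._docs: Dict[str, Doc] = {}
        self.fetch_count = 0  # how many times we had to fall back to a GET

    def remember(self, doc: Doc) -> None:
        # Deep copy so later local edits do not alter the cached version.
        self._docs[doc["_id"]] = copy.deepcopy(doc)

    def previous_for(self, new_doc: Doc) -> Doc:
        cached = self._docs.get(new_doc["_id"])
        # The cached copy is only usable if it is the revision being replaced.
        if cached is not None and cached.get("_rev") == new_doc.get("_rev"):
            return cached
        self.fetch_count += 1
        fetched = self._fetch(new_doc["_id"])
        self.remember(fetched)
        return fetched


# A plain dict stands in for the server in this sketch.
server = {"notes-42": {"_id": "notes-42", "_rev": "1-abc", "lines": ["call back"]}}
cache = PreviousVersionCache(fetch=lambda doc_id: server[doc_id])

doc = copy.deepcopy(server["notes-42"])  # stands in for the client's original GET
cache.remember(doc)
doc["lines"].append("send quote")        # local edit; _rev still names the old revision
previous = cache.previous_for(doc)       # served from the cache, no extra GET
```

After a successful write the caller would call `remember` again with the newly returned revision, otherwise the next write falls back to a GET.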

As you can see, there is nothing especially clever in my approach, and it is quite lousy in many respects.
It does not care much about consistency, i.e. if a PUT succeeds, the subsequent transaction
POST might fail. And with replication enabled, two people could edit a document in conflicting
ways and each of them would get a transaction record, even though one of their changes will
get discarded in the conflict resolution. And, well, a whole universe of failures can happen
when editing multiple docs for one transaction. So this is more of a simple paper trail, which
I keep for some months and then delete anyway. If you aim for a fully consistent,
race-condition-proof solution, it is going to be difficult (possibly impossible in a multi-master
scenario). Perhaps with update handlers you can reach a compromise solution, but I'm not sure
how that is going to work on multi-node setups either.
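
The flow described above (GET the previous version, write the update, then write a transaction record) can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `FakeDatabase` is an in-memory stand-in for a CouchDB database, `save_with_transaction` and the record fields (`user`, `date`, `changes`) are hypothetical names, and the record is stored with a client-generated id rather than a POST.

```python
import copy
import datetime
import uuid
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


class FakeDatabase:
    """In-memory stand-in for a CouchDB database (illustration only)."""

    def __init__(self) -> None:
        self.docs: Dict[str, Doc] = {}

    def get(self, doc_id: str) -> Optional[Doc]:
        doc = self.docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def put(self, doc: Doc) -> Doc:
        stored = self.docs.get(doc["_id"])
        # Mimic the revision check: an update must name the revision it replaces.
        if stored is not None and stored["_rev"] != doc.get("_rev"):
            raise ValueError("conflict: stale _rev")
        generation = int(stored["_rev"].split("-")[0]) + 1 if stored else 1
        saved = copy.deepcopy(doc)
        saved["_rev"] = f"{generation}-{uuid.uuid4().hex[:8]}"
        self.docs[saved["_id"]] = saved
        return copy.deepcopy(saved)


def save_with_transaction(docs_db: FakeDatabase, tx_db: FakeDatabase,
                          new_docs: List[Doc], user: str) -> Doc:
    """Write each doc, then one transaction record covering all of them."""
    changes = []
    for new_doc in new_docs:
        previous = docs_db.get(new_doc["_id"])  # the extra GET per write
        current = docs_db.put(new_doc)          # the actual update
        changes.append({"doc_id": new_doc["_id"],
                        "previous": previous, "current": current})
    # Not atomic: if this last write fails, the updates above are already
    # stored and simply have no transaction record.
    return tx_db.put({
        "_id": uuid.uuid4().hex,
        "user": user,
        "date": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "changes": changes,
    })


docs_db, tx_db = FakeDatabase(), FakeDatabase()
first = docs_db.put({"_id": "notes-42", "lines": ["call back"]})
first["lines"].append("send quote")
record = save_with_transaction(docs_db, tx_db, [first], user="hector")
```

The comment before the final write marks exactly the weak spot mentioned above: the document updates and the transaction record are separate writes, so a failure between them leaves a change without a paper trail.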

H
________________________________________
From: Eric B <ebenzacar@gmail.com>
Sent: Thursday, October 2, 2014 15:17
To: user@couchdb.apache.org
Subject: Re: How to store the delta between doc revisions?

On Thu, Oct 2, 2014 at 4:16 AM, Sanjuan, Hector <hector.sanjuan@here.com>
wrote:

> I manage this outside Couchdb. I have a separate database for
> "transaction" docs which store things like the date a modification occurred
> and the resources that changed and how (one transaction can account for
> changes for several docs if it happened to be triggered by the same
> operation).
>

Can you elaborate on how you do this?  I presume it must all be taken care of
on the client side?  I haven't found any way to accomplish something like
this via update handlers.

> The main objective is to be able to figure out who touched a doc, when, and
> what change was likely introduced (we don't expect to revert/restore old
> revisions too often, although we could).
>

So you only store patches between revs then, I presume?  Do you actually use
something to produce a true patch file, or just key/value pairs?  i.e.
field1=new value, field2=new value, etc.


> It has an overhead (every write triggers a GET to fetch the last revision)
> and doesn't bother much about race conditions or strict history consistency
> (if you do bother too much about these you lose many advantages of the
> noSQL model), but it is really simple to implement (and there is no need to
> debug code that runs inside couchdb).
>

Have you considered maintaining a local cache to avoid the additional GETs
every time?  i.e. upon the original GET, cache the data and then check the
cache whenever a write is executed.

I have considered this system, but without multi-document transactions,
there is no way to ensure consistency (i.e. if the document update
succeeds and the history log write fails, it is too difficult to roll back the
doc update).  And if only storing deltas, missing a rev would make it
impossible to rebuild the history of any document.  Additionally, there is
no way to effectively use update handlers, for the same reason as above.
The history log would have to be written only upon success of the update
handler, at which time it may or may not be a successful write.  Plus, it is
more difficult to retrieve the older rev of the doc that was just updated.
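
The point about deltas can be made concrete with a small sketch of replaying a delta-only history. Everything here is an assumption for illustration: the delta shape (`to_rev`, `set`, `unset`), the plain integer revision numbers (not CouchDB `_rev` strings) and the function names.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]
Delta = Dict[str, Any]


def apply_delta(doc: Doc, delta: Delta) -> Doc:
    """Apply one key/value style delta and return the resulting document."""
    result = dict(doc)
    result.update(delta.get("set", {}))
    for key in delta.get("unset", []):
        result.pop(key, None)
    return result


def rebuild(base: Doc, base_rev: int, deltas: List[Delta], target_rev: int) -> Doc:
    """Replay deltas one revision at a time, from base_rev up to target_rev."""
    by_rev = {delta["to_rev"]: delta for delta in deltas}
    doc, rev = base, base_rev
    while rev < target_rev:
        delta = by_rev.get(rev + 1)
        if delta is None:
            # One missing delta makes every later revision unrecoverable.
            raise LookupError(f"delta for revision {rev + 1} is missing")
        doc, rev = apply_delta(doc, delta), rev + 1
    return doc


base = {"title": "Client notes"}
deltas = [
    {"to_rev": 2, "set": {"lines": ["call back"]}},
    {"to_rev": 3, "set": {"owner": "eric"}, "unset": ["title"]},
]
latest = rebuild(base, 1, deltas, 3)
```

With full copies stored per revision, a lost entry only costs that one entry; with the delta chain, `rebuild` cannot get past the gap.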

Unless I am making things too complicated?

Thanks,

Eric




>
> ________________________________________
> From: Alexander Shorin <kxepal@gmail.com>
> Sent: Wednesday, October 1, 2014 22:23
> To: user@couchdb.apache.org
> Subject: Re: How to store the delta between doc revisions?
>
> That's right: validate_doc_update cannot modify the document being stored.
> But it could check whether the previous version is included in a history log
> stored within the updated document, which is actually what your update handler
> is doing. So clients have to use your update handlers or implement the
> same logic on their side to pass validation.
> --
> ,,,^..^,,,
>
>
> On Wed, Oct 1, 2014 at 11:45 PM, Eric Benzacar <eric@benzacar.ca> wrote:
> > As you mentioned, the update_notif_handler and changes feed are things that
> > are triggered after a document is persisted, so it can cause race
> > conditions.  Ideally, I'm looking to trigger a handler just before it is
> > persisted.
> >
> > I looked into the validate_doc_update function, but even if I want to store
> > the history log within the document (not opposed to it), I can't seem to
> > modify the contents in the validate_doc_update function (which is
> > appropriate).  So I'm still no further ahead in figuring out a central
> > place to do this.
> >
> > So then I am reduced to ensuring that every updateHandler I call creates a
> > history log, and that POSTs/PUTs of the document do it as well.  Which means that
> > I am putting code in several different places to perform the same task,
> > which is error prone and leads to fragmentation.
> >
> > Unless I am missing something?
> >
> > Thanks,
> >
> > Eric
> >
> > On Wed, Oct 1, 2014 at 3:30 PM, Alexander Shorin <kxepal@gmail.com> wrote:
> >
> >> Unfortunately, no. At least not completely. You can create your
> >> validate_doc_update function which will verify that every newly stored
> >> document contains some specific data (like the previous document version, to
> >> which validate_doc_update also has access), but all this leads to storing
> >> the history log inside a single document. If you want to track it
> >> separately, the changes feed and update_notification_handler are your
> >> friends, but race conditions could happen (especially if
> >> compaction gets triggered), so there will always be a chance for you to
> >> miss some revision.
> >> --
> >> ,,,^..^,,,
> >>
> >>
> >> On Wed, Oct 1, 2014 at 11:18 PM, Eric B <ebenzacar@gmail.com> wrote:
> >> > Thanks for the valid points.  But either way (whether through patches or
> >> > storing the full previous revision), is there a mechanism in CouchDB in
> >> > which I can require all calls to trigger an updateHandler?  In a way, I'm
> >> > looking more for an update interceptor; something to be run just before a
> >> > document is actually persisted to the DB, but that is always executed.
> >> >
> >> > Thanks,
> >> >
> >> > Eric
> >> >
> >> >
> >> > On Wed, Oct 1, 2014 at 3:03 PM, Alexander Shorin <kxepal@gmail.com> wrote:
> >> >
> >> >> Storing patches is good as long as you're sure that no single patch will
> >> >> suddenly get deleted. Otherwise you could easily find all your history
> >> >> broken. Obvious, but it is the thing to remember when picking this
> >> >> way of history management. Storing full document copies per revision
> >> >> is a more solid solution for such a case: you can easily skip or lose one
> >> >> or several revisions and be fine, but it also consumes much more disk
> >> >> space. Trade-offs are everywhere; pick the one that suits you.
> >> >> --
> >> >> ,,,^..^,,,
> >> >>
> >> >>
> >> >> On Wed, Oct 1, 2014 at 10:02 PM, Eric B <ebenzacar@gmail.com> wrote:
> >> >> > I'm new to CouchDB and trying to figure out the best way to store a
> >> >> > history of changes for a document.
> >> >> >
> >> >> > Originally, I was thinking the thing that makes the most sense is to use
> >> >> > the update function of CouchDB, but I'm not entirely sure if I can.  Is there
> >> >> > some way to use the update function and modify/create a second document in
> >> >> > the process?
> >> >> >
> >> >> > For example, say I have a document which contains notes for a client.
> >> >> > Every time I modify the notes document (i.e. add new lines or delete lines),
> >> >> > I want to maintain the changes made to it.  If there was a way to use
> >> >> > CouchDB's rev fields for this, my problem would be solved, but since
> >> >> > CouchDB deletes non-current revs upon compaction, that is not an option.
> >> >> >
> >> >> > So instead, I want to create a "history_log" document, where I can just
> >> >> > store the delta between documents (as a patch, for example).
> >> >> >
> >> >> > In order to do this, I need to have my existing document, my new document,
> >> >> > compare the changes and write them to a history_log document.  But I don't
> >> >> > see if/where I can do that within an update handler.
> >> >> >
> >> >> > Is there something that can help me do this easily within CouchDB?  Are
> >> >> > there patch or JSON compare functions I can have access to from within a
> >> >> > CouchDB handler?
> >> >> >
> >> >> > Thanks,
> >> >> >
> >> >> > Eric
> >> >>
> >>
>
