incubator-couchdb-dev mailing list archives

From Chris Anderson
Subject Re: History Proposal
Date Sun, 02 Aug 2009 19:29:23 GMT
On Sat, Aug 1, 2009 at 3:29 AM, Jason Davies wrote:
> On 31 Jul 2009, at 14:42, Benoit Chesneau wrote:
>> 2009/7/31 Jason Davies:
>>> The main points of this proposal are:
>>> 1. Store the historical versions of documents in a separate database.
>>> This is for a number of reasons: a) keeping it separate means we don't
>>> clog up the main database with historical data b) history-specific views
>>> can be kept here c) non-intrusive implementation of this is easier.
>>> 2. The change will be made at the couch_db layer so that *any* change to
>>> any document in the target database will be mirrored to the history
>>> database.
>> Seems good.
>>> 3. Each and every change to a document will result in a new document
>>> being created in the history database (with a new ID) containing an exact
>>> copy of that document e.g. {_id: <new ID>, doc: <exact copy of doc> }.
>> How would you handle the case of attachments? If attachments are copied
>> for each revision of a doc, it would take up a lot of space. Maybe
>> storing attachments in their own docs could be a solution, though. So
>> storing a revision would be:
>>   store attachments in separate docs
>>   create a doc {_id: <id>, doc: <doc>, attachments: [<id1>, ...]}
>> Attachments would be compared across revisions by their signature;
>> if the signature changes, a new attachment doc is created.
>> Just a thought anyway.
> Good idea, the disk space issue would be quite important for larger
> databases with a larger number of changes.  I wonder if some kind of
> alternative storage layer supporting diffs would help here.  Probably
> something to consider as a future improvement.
>>> 4. Adding meta-data to changes can be handled by a custom _update handler
>>> (yet to be developed) to set fields such as "last_modified" and
>>> "last_modified_user".

I've been quiet on this thread as I'm largely in agreement with the proposal.

I think the best route for implementation is to allow Erlang callbacks
on changes. This way we can write a simple history function that
copies off each change to a backup db, setting timestamps and userCtx
metadata on the way.
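
A minimal sketch of what such a history function might look like, written in
Python rather than Erlang purely for illustration. The HistoryRecorder class,
the on_change hook, and the plain dict standing in for the backup db are all
hypothetical names, not existing CouchDB APIs:

```python
import copy
import time
import uuid


class HistoryRecorder:
    """Hypothetical on-change callback that mirrors each change to a backup db."""

    def __init__(self, backup_db, clock=time.time):
        # backup_db is a plain dict standing in for the history database.
        self.backup_db = backup_db
        # clock is injectable so timestamps can be controlled in tests.
        self.clock = clock

    def on_change(self, doc, user_ctx):
        """Store an exact copy of doc under a new ID, plus timestamp and userCtx."""
        history_id = uuid.uuid4().hex
        self.backup_db[history_id] = {
            "_id": history_id,
            "doc": copy.deepcopy(doc),  # exact copy, isolated from later edits
            "timestamp": self.clock(),
            "userCtx": user_ctx,
        }
        return history_id
```

The application writing the original document never sees any of this; only
the callback and the backup db know that history is being kept.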

The user interface could surface this function's activation in the
node config as a check box, and applications wouldn't need to know
about it at all. It should be possible to develop a generic futon-like
interface for browsing old documents to revert individual changes, so
users can work with non-backup-aware applications.

As far as keeping track of time ranges when backups are turned off,
the user interface could record a timestamped metadata document to the
backup db whenever the switch is flipped.


>> Why not add metadata when storing a revision? The obvious ones, I
>> mean, are userCtx and date.
> My idea was that userCtx and date could be stored using _update, or do you
> think this should be done automatically?  It's certainly a possibility but I
> wouldn't want to add unnecessary data if the user doesn't need it, although
> I imagine in 99% of cases they would need the "date/time" of the change in
> the history.
>>> One use case we'd like to support is effectively (from the point of view
>>> of the user) being able to "roll back" a view to a specific point in time,
>>> but how this would look in the history database has me stumped so far.
>>> Rolling back a specific doc is easy, but multiple docs, not so easy it
>>> seems.  Any suggestions welcome!
>> Rolling back could be handled by a view based on date in the history
>> database?
> Indeed, but I haven't been able to come up with such a view without blowing
> the reduce limitations.  I want to do something like fetch all the latest
> history docs that were changed before some particular date.  As Jan pointed
> out though, this could be solved using snapshot databases instead.
> --
> Jason Davies
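
On the rollback question, the query Jason describes ("the latest history docs
that were changed before some particular date") can be sketched outside of
map/reduce as a single pass over the history docs. This is only an
illustration in Python; snapshot_at is a hypothetical helper, and it assumes
each history doc has the shape {_id, doc, timestamp} with the original ID
kept in doc._id. It does not handle deletions:

```python
def snapshot_at(history_docs, cutoff):
    """Return {original doc id: doc} as of `cutoff`, from a list of history docs."""
    latest = {}
    for entry in history_docs:
        if entry["timestamp"] > cutoff:
            continue  # change happened after the requested point in time
        doc_id = entry["doc"]["_id"]
        current = latest.get(doc_id)
        # Keep only the most recent qualifying entry per original document.
        if current is None or entry["timestamp"] > current["timestamp"]:
            latest[doc_id] = entry
    return {doc_id: entry["doc"] for doc_id, entry in latest.items()}
```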

Chris Anderson
