couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "History" by JasonDavies
Date Thu, 06 Aug 2009 16:35:41 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by JasonDavies:
http://wiki.apache.org/couchdb/History

The comment on the change is:
Updated as per latest discussion on mailing list (no more history db!)

------------------------------------------------------------------------------
  = Proposal for CouchDB history support =
  
-  * We will use a separate database to store history documents.
-  * The main change will be at the couch_db level in the document update functions, so that
*any* change to a document will be recorded in the history database.
-  * Changes to documents will result in a new document (with a new ID) being written to the
history database, of the form:
+  * Every time a document is changed, store the existing document as an attachment before
writing the updated document.
+  * For space efficiency, historical attachments are stored separately i.e. not inline with
the historical JSON document.
+  * The special "history" attachments will be stored using a special prefix of "_history/<_rev>".
+  * If people need to add meta-data to the history, e.g. "last changed by", "last changed
date/time", then the recommended way would be to use a custom _update handler to add these
fields to the doc being saved, and these would propagate to the history attachment.
+  * In future we can add delta support to further improve efficiency.
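The added bullets above can be sketched roughly as follows. This is a toy in-memory model in Python, not the proposed Erlang patch and not CouchDB's API: the `HistoryStore` class, its `save` and `history` methods, and the `N-sketch` revision strings are all hypothetical names made up for illustration. Only the `_history/<_rev>` attachment prefix comes from the proposal.

```python
import copy
import json

HISTORY_PREFIX = "_history/"  # attachment prefix named in the proposal


class HistoryStore:
    """Toy in-memory stand-in for a database (hypothetical, not CouchDB's API)."""

    def __init__(self):
        self.docs = {}

    def save(self, doc_id, new_body):
        """Write new_body, first keeping the existing doc as a history attachment."""
        existing = self.docs.get(doc_id)
        if existing is None:
            rev_num = 1
            attachments = {}
        else:
            # Snapshot the existing document *before* it is overwritten.
            # Attachments are left out of the snapshot, so older history
            # entries are not nested inside newer ones.
            snapshot = {k: v for k, v in existing.items() if k != "_attachments"}
            attachments = dict(existing["_attachments"])
            attachments[HISTORY_PREFIX + existing["_rev"]] = json.dumps(
                snapshot, sort_keys=True
            )
            rev_num = int(existing["_rev"].split("-")[0]) + 1

        doc = copy.deepcopy(dict(new_body))
        doc["_id"] = doc_id
        # Assumption: a placeholder revision string; real revisions differ.
        doc["_rev"] = f"{rev_num}-sketch"
        doc["_attachments"] = attachments
        self.docs[doc_id] = doc
        return doc["_rev"]

    def history(self, doc_id, rev):
        """Return the document body as it was at revision rev."""
        raw = self.docs[doc_id]["_attachments"][HISTORY_PREFIX + rev]
        return json.loads(raw)


store = HistoryStore()
first_rev = store.save("profile", {"name": "Ann", "changed_by": "jason"})
store.save("profile", {"name": "Anne", "changed_by": "jason"})
old = store.history("profile", first_rev)  # the overwritten version
```

Because the snapshot is taken from the stored document, any meta-data fields an update handler added to the doc (such as the hypothetical `changed_by` above) travel into the history attachment without extra work.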
  
+ == Use cases ==
- {{{{
-   _id: <uuid>,
-   % Think you'd want the previous rev ID
-   previous: <previous revision id>
-   doc: {
-     <original document>
-   }
- }
- }}}
  
+ The main use case we want to support is the ability to recover from catastrophic user errors,
e.g. if a user deletes an important document or overwrites something important.  I don't think
supporting use cases such as rolling back to particular snapshots is within the scope of this
proposal.
-  * The history database is an ordinary CouchDB database that can be manipulated as normal
e.g. views can be added to organise the history docs as required.
-  * If people need to add meta-data to the history, e.g. "last changed by", "last changed
date/time", then the recommended way would be to use a custom _update handler to add these
fields to the doc being saved, and these would propagate to the history database.
  
-  * Make it easy to "roll back" all docs to a specific point in time.  Viewing how a single
doc looked at a certain point in time is easy, but to get all docs with doc.type == 'profile'
at some point in time, for example, is a bit harder.  Suggestions welcome!
+ == Implementation ==
  
+ Native Erlang patch to core CouchDB.  We probably want the ability to turn this on/off on
a per-db basis via a .ini config option.
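As an illustration of the per-db switch only, here is a small Python sketch of reading such an option. The `[history]` section name, the "database name = true/false" key layout, and the `history_enabled` helper are assumptions invented for this example; the proposal does not specify the option's name or format.

```python
import configparser

# Hypothetical .ini fragment: section and key names are assumptions.
SAMPLE_INI = """
[history]
; database name = true/false
blog = true
scratch = false
"""


def history_enabled(ini_text, db_name, default=False):
    """Return True if history recording is switched on for db_name."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback covers both a missing section and a missing database entry
    return parser.getboolean("history", db_name, fallback=default)


blog_on = history_enabled(SAMPLE_INI, "blog")
scratch_on = history_enabled(SAMPLE_INI, "scratch")
```

Defaulting to off for databases that are not listed keeps existing databases unchanged unless history is explicitly requested.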
  
- I think the write to the history db would need to occur before the write to the main db
is completed and a failure would need to abort finalizing the write to the main db.  That
much would give you a CVS-like history of each item in the main DB with a slight chance of
branching off a single revision when there was a failure between a history write and main db
write.
- 
- To get a horizontal view of the main db at an instant, the record of the current revs for
all documents would need to be written to the history db occasionally (upper limit would be
at the completion of every main db write).  After the initial write of that record, deltas
could be used until it seems wise to rewrite the full index.  It would not be necessary to
block the main db write until that is complete.
- 
- Background tasks could go around delta-ing older documents.  That would require some support
in the main db for a document to be represented as a delta to another document.
- 
- == Potential Use Cases ==
- 
-  * View a single doc at a specific point in time.
-  * Rollback a database to a specific point in time.
-  * Query a view at a specific point in time.
-  * Query a view across a range of time.
-  * Query using a current view on a history snapshot.
-  * Retrieve a log of all modifications to a document within a time range with option to
get documents.
-  * Start recording history on an existing DB.
-  * Stop recording history on an existing DB.
-  * Start recording history on an active DB.
-  * Stop recording history on an active DB.
-  * Detecting that there is a gap in history.
-  * Querying history while DB is active.
-  * Block on history, do not allow update until history is recorded.
-  * Don't block with possible loss of history on crash, etc.
- 
- == Potential approaches ==
- 
-  * Erlang native implementation
-  * Pluggable implementation (history_handler)
-  * Writing some existing repo format (for example Subversion's FSFS)
-  * Integrating with an existing repo library (libsvn?)
-  * Integrating with libsvn and exposing svn's http interface through mochiweb.
- 
