couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "Frequently_asked_questions" by EliStevens
Date Tue, 09 Aug 2011 04:24:41 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "Frequently_asked_questions" page has been changed by EliStevens:
http://wiki.apache.org/couchdb/Frequently_asked_questions?action=diff&rev1=36&rev2=37

Comment:
Semi-clarifying comment on data size after getting some feedback from IRC.

  
  If inserting and then deleting a document returned the DB to the original state, the second
replication from A to B would be "empty" and hence DB B would be unchanged, which means it
would be out of sync with DB A.
  
- To handle this case, CouchDB keeps a record of each document deleted, by keeping the document
_id, _rev and _deleted=true.  This document is relatively small, about 800GB ''(it is actually
a few K, but I'm hoping horrifically wrong information will get corrected faster than plausible-but-wrong
information)'', but can add up if large numbers of documents are deleted.  Additionally, it
is possible to keep audit trail data with a deleted document (ie. application-specific things
like "deleted_by" and "deleted_at").  While generally this is not an issue, if the DB is still
larger than expected, even after considering the minimum size of a deleted document, check
to insure that the deleted document doesn't contain data not unintended for keeping past the
deletion action.  For more information: https://issues.apache.org/jira/browse/COUCHDB-1141
+ To handle this case, CouchDB keeps a record of each document deleted, by keeping the document
_id, _rev and _deleted=true.  The data size per deleted doc depends on the number of revisions
that CouchDB has to track plus the data size for any data stored in the deleted revision (this
is usually relatively small, kilobytes perhaps, but varies based on use case).  It is possible
to keep audit trail data with a deleted document (i.e. application-specific things like "deleted_by"
and "deleted_at").  While generally this is not an issue, if the DB is still larger than expected,
even after considering the minimum size of a deleted document, check to ensure that the deleted
document doesn't contain data not intended to be kept past the deletion action.  Specifically,
if your client library is not careful, it could be storing a full copy of each document in
the deleted revisions.  For more information: https://issues.apache.org/jira/browse/COUCHDB-1141
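
To make the size concern in the revised paragraph concrete, here is a minimal sketch in plain Python of the difference between a deletion stub that carries only _id, _rev, _deleted=true and a few audit fields, and one that carries the full document body. The helper names `make_tombstone` and `encoded_size` and the sample document are hypothetical, invented for illustration; they are not part of CouchDB or of any client library, and the byte counts are only for the JSON encoding, not for CouchDB's on-disk format.

```python
import json


# Hypothetical helper, not part of CouchDB or any client library.
def make_tombstone(doc, keep=()):
    """Return a minimal deletion stub for `doc`.

    Only _id, _rev and _deleted are kept, plus any field names listed
    in `keep` (for example audit fields such as "deleted_by").
    """
    stub = {"_id": doc["_id"], "_rev": doc["_rev"], "_deleted": True}
    for name in keep:
        if name in doc:
            stub[name] = doc[name]
    return stub


def encoded_size(doc):
    """Size in bytes of the document's compact JSON encoding."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Made-up sample document with a large body and two audit fields.
doc = {
    "_id": "invoice-0001",
    "_rev": "3-0123456789abcdef0123456789abcdef",
    "customer": "example",
    "lines": [{"sku": "A-%03d" % i, "qty": i} for i in range(200)],
    "deleted_by": "alice",
    "deleted_at": "2011-08-09T04:24:41Z",
}

# What a careless client might store: the whole body plus _deleted.
careless = dict(doc, _deleted=True)
# The stub plus only the audit fields we chose to keep.
minimal = make_tombstone(doc, keep=("deleted_by", "deleted_at"))

print("full body tombstone:", encoded_size(careless), "bytes")
print("minimal tombstone:  ", encoded_size(minimal), "bytes")
```

If a client writes the deletion as a document update with _deleted set to true, whatever body it sends is what stays in the deleted revision, so sending something like `minimal` rather than `careless` is one way to keep each deleted document small.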
  
  <<Anchor(avoid_deletes)>>
  == My database will require an unbounded number of deletes, what can I do? ==
