couchdb-dev mailing list archives

From "Jason Davies (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-69) Allow selective retaining of older revisions to a document
Date Sun, 23 Aug 2009 10:43:59 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746585#action_12746585 ]

Jason Davies commented on COUCHDB-69:
-------------------------------------

Thanks for the excellent feedback, Damien.

Firstly, I thought that adding the old revinfos to the by_seq index would be a relatively
cheap operation, as we are simply taking them from #doc_info{revs=Revs}, and this is already
done for conflicts and deleted revs.  So I don't understand why my patch involves *reading*
metadata for every old rev from the disk when updating the by_seq index, i.e. in this line:

HistoryRevInfos = if HistoryEnabled -> [{Rev, Seq, Bp} ||
        #rev_info{rev=Rev,seq=Seq,historical=true,body_sp=Bp} <- Revs];
        true -> [] end,

But yes, it does mean that we are *storing* potentially 1000s of revs in this index now, so
I'm sure that would impact performance when updating the index.  With your second point in
mind, I'm wondering whether it would be more efficient to never delete old entries from the
by_seq index (unless we purge), thus saving us having to store all the old revs each time.
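
For readers less familiar with Erlang, the comprehension quoted above can be read as a filter: only entries whose `historical` field is `true` match the pattern, and non-matching entries are skipped rather than raising an error. A rough Python rendering follows. The `RevInfo` class and the function name are illustrative stand-ins for the patch's `#rev_info{}` record, not part of CouchDB.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RevInfo:
    # Hypothetical stand-in for the #rev_info{} record used in the patch.
    rev: str
    seq: int
    historical: bool
    body_sp: int  # pointer to the revision body on disk


def history_rev_infos(revs: List[RevInfo],
                      history_enabled: bool) -> List[Tuple[str, int, int]]:
    """Mirror the quoted comprehension: keep (rev, seq, body_sp) for
    historical revisions only, or nothing when history is disabled."""
    if not history_enabled:
        return []
    # Works purely on the in-memory list; no extra lookups are performed.
    return [(r.rev, r.seq, r.body_sp) for r in revs if r.historical is True]
```

Under these assumptions the function only walks the list it is handed, which matches the point made above that the revinfos are taken from data already in memory.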

In response to your last point about multiple copies of attachments, this could be solved
if we kept an index of attachments by MD5 hash.  I believe rnewson is keen to get something
like this in, so that if you upload the same attachment to multiple docs, it only needs to
be stored once.
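
The idea of indexing attachments by MD5 hash could be sketched roughly as below. This is a toy in-memory model in Python, not CouchDB code: the `AttachmentStore` class and its method names are hypothetical, and a real implementation would need to deal with on-disk storage, reference counting for deletion and compaction.

```python
import hashlib
from typing import Dict, Tuple


class AttachmentStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # MD5 hex digest -> attachment bytes, stored once per distinct content.
        self._blobs: Dict[str, bytes] = {}
        # (doc_id, attachment_name) -> MD5 hex digest of its content.
        self._refs: Dict[Tuple[str, str], str] = {}

    def put(self, doc_id: str, name: str, data: bytes) -> str:
        digest = hashlib.md5(data).hexdigest()
        # Only store the bytes if this content has not been seen before.
        self._blobs.setdefault(digest, data)
        self._refs[(doc_id, name)] = digest
        return digest

    def get(self, doc_id: str, name: str) -> bytes:
        return self._blobs[self._refs[(doc_id, name)]]

    def stored_blob_count(self) -> int:
        return len(self._blobs)


store = AttachmentStore()
store.put("doc-a", "logo.png", b"same bytes")
store.put("doc-b", "logo-copy.png", b"same bytes")
# Both documents reference the attachment, but only one copy is held.
```

The same index would also cover the case discussed here, where several retained revisions of one document carry an unchanged attachment.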

> Allow selective retaining of older revisions to a document
> ----------------------------------------------------------
>
>                 Key: COUCHDB-69
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-69
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>         Environment: All
>            Reporter: Jan Lehnardt
>            Assignee: Paul Joseph Davis
>            Priority: Minor
>             Fix For: 0.10
>
>         Attachments: history_revs.2.patch, history_revs.3.patch, history_revs.4.patch,
>                      history_revs.5.patch, history_revs.patch
>
>
> At the moment, compaction gets rid of all old revisions of a document. Also, replication
> only deals with the latest revision. It would be nice if it were possible to specify a
> list of revisions to keep around that do not get compacted away and replicated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

