couchdb-dev mailing list archives

From "Randall Leeds (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1243) Compact and copy feature that resets changes
Date Wed, 10 Aug 2011 20:43:27 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082657#comment-13082657 ]

Randall Leeds commented on COUCHDB-1243:
----------------------------------------

If a smaller _revs_limit doesn't fix your problem, then it sounds like you have some documents
that are in conflict. The best way I can think of to automate purging the conflicts would be
to consume the /_changes feed with ?style=all_docs. Each entry in the feed will include an
array of revisions in the 'changes' property. The first of these is the winning revision.
Then use /_purge to remove all but this winning revision, and you'll be left with only the
history of the winning version. If you only consume the _changes feed up to a sequence number
before the stable replication checkpoints, you won't be destroying revisions that haven't replicated
yet, and replication should continue to function. Additionally, documents that haven't been
in conflict much but have received many updates will still have history back to _revs_limit
and should replicate safely, without introducing new conflicts, so long as they haven't received
a large number of divergent updates.
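
A minimal sketch of the approach described above, in Python. It is not a tested
tool. The helper names (build_purge_body, purge_conflicts), the db_url argument
and the max_seq cutoff are hypothetical, and the code assumes integer sequence
numbers as in CouchDB 1.x. It takes the first entry of each 'changes' array as
the winner, as stated above, and collects the rest for /_purge.

```python
import json
import urllib.request


def build_purge_body(changes_rows, max_seq):
    """Map each conflicted doc id to its non-winning leaf revisions.

    changes_rows: the "results" array from /_changes?style=all_docs.
    max_seq: hypothetical cutoff; rows with a higher seq are skipped so
             that revisions which may not have replicated yet are kept.
    """
    purge = {}
    for row in changes_rows:
        # Assumes integer sequence numbers, as in CouchDB 1.x.
        if row["seq"] > max_seq:
            continue
        revs = [change["rev"] for change in row["changes"]]
        losers = revs[1:]  # the first entry is treated as the winner
        if losers:
            purge[row["id"]] = losers
    return purge


def purge_conflicts(db_url, max_seq):
    """Fetch the changes feed and POST the losing revisions to /_purge."""
    with urllib.request.urlopen(db_url + "/_changes?style=all_docs") as resp:
        rows = json.load(resp)["results"]
    body = build_purge_body(rows, max_seq)
    if not body:
        return {}
    req = urllib.request.Request(
        db_url + "/_purge",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling build_purge_body on its own first is a cheap dry run: it shows which
revisions would be purged before anything is sent to the server.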

Paul's caveats about _purge and view indexes apply.

> Compact and copy feature that resets changes
> --------------------------------------------
>
>                 Key: COUCHDB-1243
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1243
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>    Affects Versions: 1.0.1, 1.1
>         Environment: Ubuntu, but not important
>            Reporter: Henrik Hofmeister
>              Labels: cleanup, compaction
>         Attachments: dump_load.php
>
>
> After running db and view compaction on a 70K doc db with 6+ million changes, it takes
up 0.8 GB. If copying the same documents to a new db (get and bulk insert), the same data
with 70K changes (only the inserts) takes up 40 MB. That is a huge difference. Has been verified
on 2 db's that the difference is more than 65 times the size of data.
> A "Compact and copy" feature that copies only documents, and resets the changes for a
db, would be very nice to try and limit the disk usage a little bit. (Our current test environment
takes up nearly 100 GB... )
> I've attached the dump load php script for your convenience.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
