incubator-couchdb-dev mailing list archives

From "Damien Katz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1243) Compact and copy feature that resets changes
Date Tue, 09 Aug 2011 00:16:27 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081324#comment-13081324
] 

Damien Katz commented on COUCHDB-1243:
--------------------------------------

I mostly agree with Robert Newsom that what you are asking for is a dangerous thing for CouchDB
replication. However, there is the purge option, which "forgets" documents, deleted or otherwise,
completely removing them from the internal indexes. Once documents are purged, compaction
will remove them from the file permanently. Unfortunately, I couldn't find actual
documentation on the purge functionality, so the best place to figure out how to use purge
is the purge test in the browser test suite, which can be found here:

http://svn.apache.org/viewvc/couchdb/trunk/share/www/script/test/purge.js?view=co&revision=1086241&content-type=text%2Fplain

I've often thought it would be useful to purge docs during compaction, by providing a user-defined
function that signals which unwanted docs/stubs to remove. No such thing exists, but in the
meantime you can accomplish the same thing with a purge followed by compaction.
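
As a rough illustration of the purge + compaction sequence described above, here is a minimal Python sketch. It assumes the request shapes exercised by the purge test: a POST to `/{db}/_purge` whose JSON body maps each document ID to a list of revisions, followed by a POST to `/{db}/_compact`. The helper names, the server URL, the database name, and the document ID and revision in the usage comment are all hypothetical, and the exact behaviour may differ between CouchDB versions.

```python
import json
import urllib.parse
import urllib.request


def build_purge_request(base_url, db, revs_by_id):
    """Build (but do not send) a POST /{db}/_purge request.

    revs_by_id maps each document ID to the list of revision strings
    that should be purged, e.g. {"some-doc": ["3-abc"]}.
    """
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(db, safe='')}/_purge"
    return urllib.request.Request(
        url=url,
        data=json.dumps(revs_by_id).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_compact_request(base_url, db):
    """Build (but do not send) a POST /{db}/_compact request."""
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(db, safe='')}/_compact"
    return urllib.request.Request(
        url=url,
        data=b"",  # empty body; the JSON content type is still sent
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def purge_then_compact(base_url, db, revs_by_id):
    """Send the purge request, then trigger compaction (needs a live server)."""
    requests = (
        build_purge_request(base_url, db, revs_by_id),
        build_compact_request(base_url, db),
    )
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            print(resp.status, resp.read().decode("utf-8"))


# Hypothetical usage against a local test server:
# purge_then_compact("http://localhost:5984", "testdb", {"some-doc": ["3-abc"]})
```

Compaction runs in the background on the server, so the second request only starts it. Since purged documents are forgotten entirely, the replication caveat above still applies, and this is best tried on a disposable database first.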

> Compact and copy feature that resets changes
> --------------------------------------------
>
>                 Key: COUCHDB-1243
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1243
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>    Affects Versions: 1.0.1, 1.1
>         Environment: Ubuntu, but not important
>            Reporter: Henrik Hofmeister
>              Labels: cleanup, compaction
>         Attachments: dump_load.php
>
>
> After running db and view compaction on a 70K doc db with 6+ million changes, it takes
> up 0.8 GB. Copying the same documents to a new db (get and bulk insert), the same data
> with 70K changes (only the inserts) takes up 40 MB. That is a huge difference. It has been verified
> on 2 dbs that the difference is more than 65 times the size of the data.
> A "Compact and copy" feature that copies only documents and resets the changes for a
> db would be very nice to try and limit the disk usage a little bit. (Our current test environment
> takes up nearly 100 GB...)
> I've attached the dump load php script for your convenience.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
