Date: Tue, 9 Aug 2011 00:16:27 +0000 (UTC)
From: "Damien Katz (JIRA)"
To: dev@couchdb.apache.org
Reply-To: dev@couchdb.apache.org
Message-ID: <1721328064.18409.1312848987411.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <572996361.17940.1312840107142.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (COUCHDB-1243) Compact and copy feature that resets changes

    [ 
https://issues.apache.org/jira/browse/COUCHDB-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081324#comment-13081324 ]

Damien Katz commented on COUCHDB-1243:
--------------------------------------

I mostly agree with Robert Newsom that what you are asking for is a dangerous thing for CouchDB replication.

However, there is the purge option, which "forgets" documents, deleted or otherwise, completely removing them from the internal indexes. Once documents are purged, compaction will remove them from the file forever. Unfortunately, I couldn't find actual documentation on the purge functionality, so the best way to figure out how to use purge is to look at the purge test in the browser test suite, which can be found here:

http://svn.apache.org/viewvc/couchdb/trunk/share/www/script/test/purge.js?view=co&revision=1086241&content-type=text%2Fplain

I've often thought it would be useful to purge docs during compaction, by providing a user-defined function that signals which unwanted docs/stubs to remove. No such thing exists yet; in the meantime you can accomplish the same with a purge followed by a compaction.

> Compact and copy feature that resets changes
> --------------------------------------------
>
>                 Key: COUCHDB-1243
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1243
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>    Affects Versions: 1.0.1, 1.1
>         Environment: Ubuntu, but not important
>            Reporter: Henrik Hofmeister
>              Labels: cleanup, compaction
>         Attachments: dump_load.php
>
>
> After running db and view compaction on a 70K doc db with 6+ million changes, it takes up 0.8 GB. If copying the same documents to a new db (get and bulk insert), the same data with 70K changes (only the inserts) takes up 40 MB. That is a huge difference. It has been verified on 2 dbs that the difference is more than 65 times the size of the data.
> A "Compact and copy" feature that copies only documents and resets the changes for a db would be very nice, to try to limit the disk usage a little bit. (Our current test environment takes up nearly 100 GB...)
> I've attached the dump load php script for your convenience.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
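
The purge followed by compaction that the comment describes can be sketched roughly as below. This is only an illustration, not taken from the thread or from purge.js: it assumes the HTTP endpoints `POST /{db}/_purge` (JSON body mapping each document id to a list of revisions) and `POST /{db}/_compact`, and the server URL, database name, document id and revision are all hypothetical placeholders. Behaviour may differ between CouchDB versions, so check it against the purge test linked above before relying on it.

```python
import json
import urllib.request


def build_purge_request(base_url, db, revs_by_id):
    """Build a POST to /{db}/_purge.

    revs_by_id maps each document id to the list of revisions to purge,
    e.g. {"doc1": ["1-abc"]} (hypothetical id and revision).
    """
    body = json.dumps(revs_by_id).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_purge",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_compact_request(base_url, db):
    """Build a POST to /{db}/_compact (empty body, JSON content type)."""
    return urllib.request.Request(
        f"{base_url}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def purge_and_compact(base_url, db, revs_by_id, opener=urllib.request.urlopen):
    """Purge the given revisions, then trigger compaction.

    Returns the decoded JSON response of each request, in order.
    The opener parameter is an assumption of this sketch so that the
    network call can be replaced in tests.
    """
    responses = []
    requests = (
        build_purge_request(base_url, db, revs_by_id),
        build_compact_request(base_url, db),
    )
    for req in requests:
        with opener(req) as resp:
            responses.append(json.loads(resp.read().decode("utf-8")))
    return responses
```

A call such as `purge_and_compact("http://localhost:5984", "mydb", {"doc1": ["1-abc"]})` would need a running server and, typically, admin credentials; compaction itself runs in the background after the request returns.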