From: Pascal Borghino
Date: Thu, 26 Feb 2009 14:13:44 +0100
To: user@couchdb.apache.org
Subject: Re: What am I doing wrong?

Hi guys,

just to keep you updated.
I ran out of space last time while trying to compact my 83 GB file...
I re-ran the test with half the number of docs, 2.3M:

-rw-r--r-- 1 root root 24G Feb 26 01:45 test.couch

{"db_name":"test","doc_count":2219598,"doc_del_count":0,"update_seq":2219598,"purge_seq":0,"compact_running":false,"disk_size":25017692071,"instance_start_time":"1235590552047908"}

curl -X POST http://localhost:5984/test/_compact

-rw-r--r-- 1 root root 17G Feb 26 13:00 test.couch

it took 3 hours to do the compaction....
but we got 7 GB back, about 30%... quite big

P.

> Also, 0.8.1 compaction has a hard time compacting big dbs. Trunk is
> better.
>
> -Damien
>
>
> On Feb 20, 2009, at 12:04 PM, Jan Lehnardt wrote:
>
>>
>> On 20 Feb 2009, at 17:42, Pascal Borghino wrote:
>>
>>> Hi there, I do not have attachments...
>>>
>>> $ ls -lh
>>> -rw-r--r-- 1 root root 83G Feb 20 02:40 test.couch
>>> -rw-r--r-- 1 root root 23G Feb 20 16:33 test.couch.compact
>>>
>>> $ du -sh
>>> 107G .
>>>
>>> still... from 19 GB to 83 GB... huge difference.
>>> P.
>>
>> The fact that there is a .compact file means that compaction
>> is still running (or was aborted). When you restart it, you
>> should see it in the "Status" section of Futon and how far
>> along it is. Compaction will continue where it left off. Please
>> let us know what the final database file size is when compaction
>> is finished.
>>
>> If you did an insertion of a lot of single documents, quite
>> extensive sparseness can occur. On large imports, do
>> use bulk inserts (see the wiki) or if that is not possible,
>> compact every once in a while during the import.
>>
>> Cheers
>> Jan
>> --
>>
>>> Robert Newson wrote:
>>>> I expect the b-tree wastage is minimal (though not zero).
>>>>
>>>> I've wondered what happens on filesystems that don't support sparse
>>>> files, I assume they'd just be slower and use more disk space. Given
>>>> that the holes vanish after compaction, I suspected a bad calculation
>>>> in the code (couch_db.erl, I think), but I've not found it, it seems
>>>> to do the right thing. HFS+ doesn't support holes but I'm pretty sure
>>>> NTFS does.
>>>>
>>>> Btw, it's mostly around attachments. If you add lots of documents but
>>>> no attachments, ls and df are in close agreement.
>>>>
>>>> B.
>>>>
>>>> On Fri, Feb 20, 2009 at 4:00 PM, Jens Alfke wrote:
>>>>
>>>>> On Feb 20, 2009, at 6:03 AM, Pascal Borghino wrote:
>>>>>
>>>>>> I am currently compacting it...
>>>>>> even if 'Compaction rewrites the database file, removing
>>>>>> outdated document revisions and deleted documents'... no
>>>>>> document should be outdated or deleted...
>>>>>>
>>>>> In addition to the sparseness of the file, another reason for the
>>>>> size difference might be obsolete b-tree nodes. The file is
>>>>> append-only, so any time a b-tree changes, the old nodes remain in
>>>>> the file. If you've done a large number of individual insertions,
>>>>> that space might be significant. (Probably not gigabytes, though.)
>>>>>
>>>>> robert.newson@gmail.com wrote:
>>>>>
>>>>>> I find the actual consumed space is far, far less than 'ls' shows.
>>>>>> CouchDB .couch files are very sparse, large gaps of unwritten
>>>>>> data, ostensibly to keep btree and document items separate, but
>>>>>> these 'holes' vanish after compaction, even if you have zero
>>>>>> updates and deletes.
>>>>>>
>>>>> Hm. But not all filesystems support sparse files. HFS+, the Mac OS
>>>>> filesystem, doesn't. (Does NTFS?) Is there an option to suppress
>>>>> the gaps?
>>>>>
>>>>> -Jens
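
For anyone who wants to check the ls versus du gap discussed in this thread on their own file, here is a minimal Python sketch. It is not from the thread: the function name size_report and the example path are made up for illustration, and it assumes a POSIX system where st_blocks is counted in 512-byte units (it will not work as-is on Windows).

```python
import os


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for the file at `path`.

    apparent_bytes is the length that `ls -l` reports.
    allocated_bytes approximates what `du` counts, i.e. the blocks the
    filesystem has actually allocated for the file.
    Assumption: POSIX semantics, where st_blocks uses 512-byte units.
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512
    return apparent, allocated


def describe(path):
    """Format both sizes in GiB on one line, for a quick comparison."""
    apparent, allocated = size_report(path)
    gib = 1024 ** 3
    return "%s: apparent %.2f GiB, allocated %.2f GiB" % (
        path, apparent / gib, allocated / gib)
```

Calling describe("test.couch") (hypothetical path) before and after compaction gives the two figures side by side. If the allocated figure is much smaller than the apparent one, the file probably contains holes; if the two are close, the space is really in use.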