From: Jan Lehnardt
To: user@couchdb.apache.org
In-Reply-To: <46aeb24f0902200823j77134ce2h4aeb84b1289d6863@mail.gmail.com>
Subject: Re: What am I doing wrong?
Date: Fri, 20 Feb 2009 17:40:49 +0100
References: <499EB832.7040704@yahoo.fr> <98EB4A72-76A5-4A9B-BD8C-6F04D26F3D88@mooseyard.com> <46aeb24f0902200823j77134ce2h4aeb84b1289d6863@mail.gmail.com>

On 20 Feb 2009, at 17:23, Robert Newson wrote:

> I expect the b-tree wastage is minimal (though not zero).

actually not, only after compaction.

Cheers
Jan
--

> I've wondered what happens on filesystems that don't support sparse
> files, I assume they'd just be slower and use more disk space. Given
> that the holes vanish after compaction, I suspected a bad calculation
> in the code (couch_db.erl, I think), but I've not found it, it seems
> to do the right thing. HFS+ doesn't support holes but I'm pretty sure
> NTFS does.
>
> Btw, it's mostly around attachments. If you add lots of documents but
> no attachments, ls and df are in close agreement.
>
> B.
>
> On Fri, Feb 20, 2009 at 4:00 PM, Jens Alfke wrote:
>>
>> On Feb 20, 2009, at 6:03 AM, Pascal Borghino wrote:
>>
>>> I am currently compacting it... even if 'Compaction rewrites the
>>> database file, removing outdated document revisions and deleted
>>> documents'... no document should be outdated or deleted...
>>
>> In addition to the sparseness of the file, another reason for the
>> size difference might be obsolete b-tree nodes. The file is
>> append-only, so any time a b-tree changes, the old nodes remain in
>> the file. If you've done a large number of individual insertions,
>> that space might be significant. (Probably not gigabytes, though.)
>>
>>
>> robert.newson@gmail.com wrote:
>>
>>> I find the actual
>>> consumed space is far, far less than 'ls' shows. CouchDB .couch
>>> files are very sparse, large gaps of unwritten data, ostensibly to
>>> keep btree and document items separate, but these 'holes' vanish
>>> after compaction, even if you have zero updates and deletes.
>>
>> Hm. But not all filesystems support sparse files. HFS+, the Mac OS
>> filesystem, doesn't. (Does NTFS?) Is there an option to suppress
>> the gaps?
>>
>> --Jens
>
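
A small sketch for checking the ls-versus-df discrepancy discussed in this thread: it compares a file's apparent size (what ls shows) with the space actually allocated to it (what du and df count). The helper names `file_sizes` and `make_file_with_hole` and the file name `demo.couch` are made up for illustration, and nothing here is taken from CouchDB itself. It assumes a POSIX system where `os.stat` reports `st_blocks` in 512-byte units; whether the skipped range is really stored as a hole depends on the filesystem.

```python
import os
import tempfile


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the logical length that ls reports.
    allocated_bytes is derived from st_blocks, which POSIX systems
    count in 512-byte units (st_blocks is not available on Windows).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def make_file_with_hole(path, hole_bytes):
    """Write 4 bytes, seek past a gap of hole_bytes, write 4 more bytes.

    On a filesystem with sparse-file support the skipped range is
    usually not allocated; elsewhere it is filled with zeros on disk.
    """
    with open(path, "wb") as f:
        f.write(b"head")
        f.seek(hole_bytes, os.SEEK_CUR)
        f.write(b"tail")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.couch")  # hypothetical file name
        make_file_with_hole(path, 16 * 1024 * 1024)
        apparent, allocated = file_sizes(path)
        print(f"apparent:  {apparent} bytes")
        print(f"allocated: {allocated} bytes")
```

Pointing `file_sizes` at a real .couch file before and after compaction should show whether the difference comes from unallocated holes. On a filesystem without sparse-file support the two numbers would be expected to be roughly equal.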