From: Jan Lehnardt
To: user@couchdb.apache.org
Subject: Re: What am I doing wrong?
Date: Fri, 20 Feb 2009 18:04:54 +0100

On 20 Feb 2009, at 17:42, Pascal Borghino wrote:

> Hi there, I do not have attachments...
>
> $ ls -lh
> -rw-r--r-- 1 root root 83G Feb 20 02:40 test.couch
> -rw-r--r-- 1 root root 23G Feb 20 16:33 test.couch.compact
>
> $ du -sh
> 107G .
>
> still... from 19 GB to 83 GB... huge difference.
> P.

The fact that there is a .compact file means that compaction is still
running (or was aborted). When you restart it, the "Status" section of
Futon should show it and how far along it is. Compaction will continue
where it left off.

Please let us know what the final database file size is when
compaction is finished.

If you inserted a lot of single documents, quite extensive sparseness
can occur. On large imports, do use bulk inserts (see the wiki) or, if
that is not possible, compact every once in a while during the import.

Cheers
Jan
--

>
> Robert Newson wrote:
>> I expect the b-tree wastage is minimal (though not zero).
>>
>> I've wondered what happens on filesystems that don't support sparse
>> files; I assume they'd just be slower and use more disk space. Given
>> that the holes vanish after compaction, I suspected a bad calculation
>> in the code (couch_db.erl, I think), but I've not found it; it seems
>> to do the right thing. HFS+ doesn't support holes but I'm pretty sure
>> NTFS does.
>>
>> Btw, it's mostly around attachments. If you add lots of documents but
>> no attachments, ls and df are in close agreement.
>>
>> B.
>>
>> On Fri, Feb 20, 2009 at 4:00 PM, Jens Alfke wrote:
>>
>>> On Feb 20, 2009, at 6:03 AM, Pascal Borghino wrote:
>>>
>>>> I am currently compacting it...
>>>> even if 'Compaction rewrites the database
>>>> file, removing outdated document revisions and deleted documents'... no
>>>> document should be outdated or deleted...
>>>>
>>> In addition to the sparseness of the file, another reason for the size
>>> difference might be obsolete b-tree nodes. The file is append-only, so any
>>> time a b-tree changes, the old nodes remain in the file. If you've done a
>>> large number of individual insertions, that space might be significant.
>>> (Probably not gigabytes, though.)
>>>
>>> robert.newson@gmail.com wrote:
>>>
>>>> I find the actual
>>>> consumed space is far, far less than 'ls' shows. CouchDB .couch files
>>>> are very sparse, with large gaps of unwritten data, ostensibly to keep
>>>> btree and document items separate, but these 'holes' vanish after
>>>> compaction, even if you have zero updates and deletes.
>>>>
>>> Hm. But not all filesystems support sparse files. HFS+, the Mac OS
>>> filesystem, doesn't. (Does NTFS?) Is there an option to suppress the gaps?
>>>
>>> --Jens
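
The ls versus du comparison discussed in this thread can be reproduced for a single file. The following is a minimal Python sketch, not something posted in the thread: it reports the apparent size (what ls -l shows) next to the space actually allocated on disk (roughly what du counts). The function name size_report is hypothetical, and the sketch assumes a POSIX system, where st_blocks is counted in 512-byte units.

```python
import os
import tempfile


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the length `ls -l` reports; allocated_bytes is
    roughly what `du` counts. Assumes POSIX: st_blocks is in 512-byte
    units and is not available on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Demo on a throwaway file: extend it to 1 MiB without writing data.
    # On a filesystem with sparse-file support, little or nothing is
    # allocated; on one without, the full length is allocated.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(1024 * 1024)
        demo_path = f.name
    try:
        apparent, allocated = size_report(demo_path)
        print(f"apparent={apparent} allocated={allocated}")
    finally:
        os.remove(demo_path)
```

Calling size_report on a .couch file and finding the allocated figure far below the apparent one would suggest the file is sparse. If the two are close, as in Pascal's ls and du output above, the space is really in use.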