couchdb-user mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: specifying an _id results in a much smaller DB?
Date Tue, 26 May 2009 21:36:06 GMT
On Tue, May 26, 2009 at 2:31 PM, Jeff Macdonald <macfisherman@gmail.com> wrote:
> Hi all,
> I've been experimenting with CouchDB. I'm using Net::CouchDB to batch insert
> 20 docs at a time and I'm simply setting _id to a sequence that is
> incremented for each doc. For just over 9 million rows where each row is
> just 6 small fields the resulting DB is 3.4G. When I was letting CouchDB set
> the _id, the resulting database was over 20G. The input source as a tab
> delimited file is just over 500MB.
>
> So is it normal for CouchDB to create such a large database file when it
> assigns document ids?
>

Yes. CouchDB currently generates random docids, which means more of the
btree must be rewritten on each update than when the ids are clustered
together, as they are with sequential ids. For high-performance
applications, sequential ids are faster as well.
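
To illustrate the sequential-id approach described above, here is a minimal
Python sketch that assigns zero-padded, increasing ids to a batch of rows and
wraps them in the {"docs": [...]} body that the _bulk_docs endpoint accepts.
The function names, the padding width, and the sample rows are assumptions
made for this example; they do not come from the thread or from Net::CouchDB.

```python
import json


def sequential_ids(start, count, width=10):
    """Yield zero-padded decimal ids so string order matches numeric order."""
    for n in range(start, start + count):
        yield f"{n:0{width}d}"


def bulk_docs_payload(rows, start=1, width=10):
    """Attach consecutive ids to rows and wrap them for a _bulk_docs POST.

    `rows` is a list of dicts with string keys (hypothetical sample data).
    The input dicts are copied, not modified.
    """
    ids = sequential_ids(start, len(rows), width)
    docs = [dict(row, _id=doc_id) for row, doc_id in zip(rows, ids)]
    return {"docs": docs}


if __name__ == "__main__":
    batch = [{"name": "a"}, {"name": "b"}]
    # The caller keeps track of `start` across batches (e.g. start += len(batch)).
    print(json.dumps(bulk_docs_payload(batch, start=1)))
```

Zero-padding is used here so that ids of different lengths still sort in
insertion order; without it, "10" would sort before "9" as a string.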

Compacting may shrink your databases so they are roughly equal in size.
You can trigger compaction from Futon. I'd be interested to see what
results you get.
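
Besides Futon, compaction can also be requested over HTTP with a POST to the
database's _compact resource. The sketch below is only an illustration: the
server URL and database name are placeholders, and the Content-Type header is
what current CouchDB releases expect, so older versions may behave
differently. The server may also require admin credentials, which are not
handled here.

```python
import urllib.parse
import urllib.request


def build_compact_request(base_url, db_name):
    """Build (but do not send) a POST request for <base_url>/<db_name>/_compact."""
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(db_name, safe='')}/_compact"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the request only needs to be a POST
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_compaction(base_url, db_name):
    """Send the compaction request and return the HTTP status code."""
    with urllib.request.urlopen(build_compact_request(base_url, db_name)) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder server and database name; adjust for your installation.
    print(trigger_compaction("http://localhost:5984", "mydb"))
```

Compaction runs in the background, so the file size should be compared only
after it has finished.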


> --
> Jeff Macdonald
> Ayer, MA
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
