couchdb-user mailing list archives

From Jeff Macdonald <macfisher...@gmail.com>
Subject Re: specifying an _id results in a much smaller DB?
Date Tue, 26 May 2009 22:10:52 GMT
On Tue, May 26, 2009 at 5:36 PM, Chris Anderson <jchris@apache.org> wrote:

> On Tue, May 26, 2009 at 2:31 PM, Jeff Macdonald <macfisherman@gmail.com>
> wrote:
> > Hi all,
> > I've been experimenting with CouchDB. I'm using Net::CouchDB to batch
> > insert 20 docs at a time and I'm simply setting _id to a sequence that is
> > incremented for each doc. For just over 9 million rows where each row is
> > just 6 small fields the resulting DB is 3.4G. When I was letting CouchDB
> > set the _id, the resulting database was over 20G. The input source as a
> > tab-delimited file is just over 500MB.
> >
> > So is it normal for CouchDB to create such a large database file when it
> > assigns document ids?
> >
>
> yes, currently the docids couchdb assigns are random, which means more
> of the btree must be rewritten than if the ids were concentrated, as
> they are with sequential ids. for high performance applications,
> sequential ids are faster as well.
>
> Compacting may shrink your databases so they are roughly equal size.
> You can trigger compaction from Futon. I'd be interested to see what
> results you get.


Well, the load took over a day before. I was however only inserting 10
docs at a time then. So, right now I'm not motivated to find out how well
compaction would work. :)
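
For anyone wanting to reproduce the comparison, here is a minimal Python
sketch of the sequential-_id batching discussed in this thread. It is not
the Net::CouchDB code used for the load: the function name bulk_batches,
its parameters, and the zero-padding of the ids are all assumptions made
for illustration (padding is only there so that the string ids sort in the
same order as the counter). It only builds the JSON request bodies and
does not talk to a server.

```python
import json
from itertools import islice


def bulk_batches(rows, batch_size=20, start=1, width=10):
    """Yield JSON bodies for bulk inserts, assigning sequential _id values.

    rows       -- iterable of dicts, one per document (hypothetical input)
    batch_size -- documents per request (20, as in the thread)
    start      -- first value of the id counter (assumption)
    width      -- zero-padding width of the id string (assumption)
    """
    it = iter(rows)
    next_id = start
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        docs = []
        for row in chunk:
            doc = dict(row)  # copy so the caller's row is not modified
            doc["_id"] = f"{next_id:0{width}d}"
            next_id += 1
            docs.append(doc)
        # One body per batch, in the {"docs": [...]} shape used for bulk inserts.
        yield json.dumps({"docs": docs})
```

Each yielded body would be sent as a POST to the database's _bulk_docs
resource with Content-Type: application/json. Compaction can likewise be
triggered over HTTP with a POST to the database's _compact resource, which
should be equivalent to starting it from Futon.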



-- 
Jeff Macdonald
Ayer, MA
