couchdb-user mailing list archives

From Filipe David Manana <>
Subject Re: Some CouchDB internals questions?
Date Thu, 17 Mar 2011 08:56:42 GMT
On Wed, Mar 16, 2011 at 4:53 PM, Zdravko Gligic <> wrote:
> WOW !
> So, how long might it take for this not only to become part of CouchDB
> core but then also to get implemented by all of the other CouchDB
> dialects such as CouchBase and BigCouch, etc.?

Hopefully it shouldn't take long to land in Apache CouchDB's trunk, as
so far none of the existing features/components suffer a performance
drop, and it doesn't change any API either. The change itself is
relatively simple as well.

For BigCouch, I can't tell; you should ask the Cloudant people (Adam, Robert, etc.).

> And as dumb as it might sound ;) why was this not done (: the right
> way :) from the very beginning ;?)

I have no idea. I've only been in the community for a little more than a year.
I would assume that initially developers were more worried about
correctness and elegant APIs, which is perfectly reasonable and sane -
performance should come after.


> On Wed, Mar 16, 2011 at 10:02 AM, Filipe David Manana
> <> wrote:
>> Zdravko,
>> Yesterday a performance related ticket was created:
>> Apart from the performance improvements, it also reduces database
>> sizes very significantly (to between one half and about one tenth of
>> the original size). So you might be interested to follow/read.
>> On Tue, Mar 15, 2011 at 7:32 PM, Paul Davis <> wrote:
>>> On Tue, Mar 15, 2011 at 2:53 PM, Zdravko Gligic <> wrote:
>>>>> Have you compacted your db and views?
>>>> Yes
>>>>> There's unfortunately no direct way to calculate an upper threshold, it
>>>>> really depends on your method for inserting as well as how often you
>>>>> compact.
>>>> Once both (docs and view) are compacted, is the resulting size at all
>>>> dependent on how the docs and/or views were created in the first place
>>>> (one at a time or in bulk or whatever) ?
>>> I think to get the absolute minimum post-compaction size you need to
>>> compact twice. I haven't done lots of extensive testing on this, but
>>> last I recall the basic logic was the first time can end up writing
>>> docs in a somewhat randomish ordering depending on how they were
>>> inserted.
>>>>> This is due to the tail append storage which will orphan data
>>>>> in the file as it writes new records to the various internal data
>>>>> structures.
>>>> My 1,500 docs are taking up almost 15 meg (roughly 1/4-1k docs with 2
>>>> views + 1 view with doc re-emit) and I believe were around 50meg
>>>> before compactions.
>>> More importantly, what was the datasize post-compaction though? If
>>> your main db is 15Meg, and you have a view that re-emits the doc, I'd
>>> expect you to have a total size of at least 30Meg. Depending on what
>>> you're emitting in the other two views getting closer to that 50 isn't
>>> hugely out of the question.
>> --
>> Filipe David Manana,
>> "Reasonable men adapt themselves to the world.
>>  Unreasonable men adapt the world to themselves.
>>  That's why all progress depends on unreasonable men."
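
To make Paul's back-of-the-envelope estimate quoted above concrete, here is
a minimal sketch in Python. It is not how CouchDB accounts for disk usage;
it only assumes, as Paul does, that a view re-emitting the whole doc stores
roughly one extra copy of the database's contents. The function name and
the sizes passed for the other views are hypothetical.

```python
def estimate_min_total_mb(db_mb, reemit_views=1, other_views_mb=()):
    """Rough lower bound on total on-disk size after compaction.

    Assumption (not a CouchDB formula): each view that re-emits the
    full doc costs about as much as the compacted database itself.
    Sizes of the remaining views are supplied by the caller.
    """
    return db_mb + reemit_views * db_mb + sum(other_views_mb)


# 15 MB database plus one view that re-emits every doc.
print(estimate_min_total_mb(15.0))

# Same database, with two further views of a made-up 10 MB each.
print(estimate_min_total_mb(15.0, other_views_mb=(10.0, 10.0)))
```

With the 15 Meg database from the thread, the first call gives the "at
least 30 Meg" figure; how close the total gets to 50 depends entirely on
what the other two views emit.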

Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
