couchdb-user mailing list archives

From Zdravko Gligic <zgli...@gmail.com>
Subject Re: Some CouchDB internals questions?
Date Wed, 16 Mar 2011 16:53:36 GMT
WOW !

So, how long might it take for this not only to become part of CouchDB
core but then also to get implemented by all of the other CouchDB
dialects such as CouchBase and BigCouch, etc.?

And as dumb as it might sound ;) why was this not done (: the right
way :) from the very beginning ;?)

On Wed, Mar 16, 2011 at 10:02 AM, Filipe David Manana
<fdmanana@apache.org> wrote:
> Zdravko,
>
> Yesterday a performance related ticket was created:
>
> https://issues.apache.org/jira/browse/COUCHDB-1092
>
> Apart from the performance improvements, it also significantly
> reduces database sizes (to between roughly 1/2 and 1/10 of the
> current size). So you might be interested to follow/read.
>
> On Tue, Mar 15, 2011 at 7:32 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> On Tue, Mar 15, 2011 at 2:53 PM, Zdravko Gligic <zgligic@gmail.com> wrote:
>>>> Have you compacted your db and views?
>>>
>>> Yes
>>>
>>>> There's unfortunately no direct way to calculate an upper threshold; it
>>>> really depends on your method for inserting as well as how often you
>>>> compact.
>>>
>>> Once both (docs and view) are compacted, is the resulting size at all
>>> dependent on how the docs and/or views were created in the first place
>>> (one at a time or in bulk or whatever) ?
>>>
>>
>> I think to get the absolute minimum post-compaction size you need to
>> compact twice. I haven't done lots of extensive testing on this, but
>> last I recall the basic logic was the first time can end up writing
>> docs in a somewhat randomish ordering depending on how they were
>> inserted.
>>
>>>> This is due to the tail append storage which will orphan data
>>>> in the file as it writes new records to the various internal data
>>>> structures.
>>>
>>> My 1,500 docs are taking up almost 15 meg (roughly 1/4-1k docs with 2
>>> views + 1 view with doc re-emit) and I believe were around 50meg
>>> before compactions.
>>>
>>
>> More importantly, what was the datasize post-compaction though? If
>> your main db is 15Meg, and you have a view that re-emits the doc, I'd
>> expect you to have a total size of at least 30Meg. Depending on what
>> you're emitting in the other two views getting closer to that 50 isn't
>> hugely out of the question.
>>
>
>
>
> --
> Filipe David Manana,
> fdmanana@gmail.com, fdmanana@apache.org
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
>
