couchdb-dev mailing list archives

From: Chris Anderson <>
Subject: Re: Tail Append Headers
Date: Thu, 21 May 2009 05:49:30 GMT
On Wed, May 20, 2009 at 9:48 AM, Adam Kocoloski <> wrote:
> On May 20, 2009, at 11:34 AM, Damien Katz wrote:
>> On May 20, 2009, at 11:26 AM, Paul Davis wrote:
>>> On Wed, May 20, 2009 at 11:22 AM, Damien Katz <> wrote:
>>>> On May 20, 2009, at 11:09 AM, Damien Katz wrote:
>>>>> Previously, only btree nodes were saved compressed and docs were not. I
>>>>> didn't realize the compression was so expensive, but now that I've
>>>>> switched it off on both the branch and on trunk, I see big performance
>>>>> boosts for both.
>>>>> And now the tail append stuff is slightly faster on my machine.
>>>> To clarify, disabling the compression completely on both trunk and the
>>>> branch results in big performance increases for both, with the tail_header
>>>> branch now being slightly faster than trunk running the lightning test on
>>>> my machine.
>>>> -Damien
>>> Awesome. Is there a noticeable size difference on the database files?
>> Without compression it looks to take about 2x as much disk space.
> Nice find.  I also see the tail_header branch slightly faster than trunk
> with compression turned off on both, and the DB size increased by ~2x.  For
> kicks I tried turning the compression level down to 1 (the default is 6 on
> a 1-9 scale).  Running hovercraft:lightning() gives me
>
> compression level   insert rate (docs/sec)   db size
> 0                   11725                     16.7MB
> 1                    4186                      8.2MB
> 6 (default)          3938                      7.8MB

This is a really cool chart. It'd be fun to keep a metric of this over time.
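
If anyone wants to poke at the serialization cost in isolation, here's a
rough sketch using Erlang's term_to_binary/2 with the {compressed, Level}
option (same 0-9 scale, 6 being the default). The module name and the doc
body are made up for illustration; it's not the hovercraft workload:

    %% compress_bench.erl -- sketch only: times term_to_binary/2 at a few
    %% compression levels on a made-up doc term.
    -module(compress_bench).
    -export([run/0]).

    run() ->
        Doc = {[{<<"_id">>, <<"example">>},
                {<<"body">>, binary:copy(<<"some repetitive text ">>, 100)}]},
        [report(Level, Doc) || Level <- [0, 1, 6, 9]],
        ok.

    report(Level, Doc) ->
        %% timer:tc/1 returns {Microseconds, Result}.
        {Micros, Bin} =
            timer:tc(fun() -> term_to_binary(Doc, [{compressed, Level}]) end),
        io:format("level ~p: ~p bytes in ~p us~n",
                  [Level, byte_size(Bin), Micros]).

Level 0 means no compression at all, which explains why it stands so far
apart from level 1 in your table.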

I'm getting around 9k docs/sec from hovercraft:lightning() on the append
branch, which is a substantial step up from trunk, which runs closer to
4.5k docs/sec.

Trunk with compression off gives me 5.5-6k docs/sec, so the tail_append
branch is clearly faster. I wonder what a compressing filesystem would
cost us in performance.

Good work on this branch, Damien. I'm pretty impressed by how quickly
you put it together.

> So it's still a huge cost.  The nice thing is that binary_to_term seems
> perfectly happy reading a mix of compressed and uncompressed binaries, which
> means the compression level can be a configuration parameter if we want it
> to be.  gzip decompresses pretty quickly, so I'm guessing that reading a
> compressed DB will be faster than an uncompressed one.  We'll have to
> measure it, though.
> Adam
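
That mixed-read behavior is easy to check in a shell: the compression flag
is carried inside the external term format itself, so binary_to_term/1
never needs to be told which kind it's reading. A quick sketch (the doc
term is made up):

    1> Doc = {doc, <<"same term either way">>}.
    2> Plain = term_to_binary(Doc).                     % uncompressed
    3> Packed = term_to_binary(Doc, [{compressed, 1}]). % level 1
    4> Doc = binary_to_term(Plain).   % round-trips
    5> Doc = binary_to_term(Packed).  % also round-trips, same term

Which is what would let the level be a config knob with no read-side
migration, as you say.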

Chris Anderson
