incubator-couchdb-user mailing list archives

From Paul Davis <>
Subject Re: Some guidance with extremely slow indexing
Date Sat, 11 Apr 2009 19:06:12 GMT
On Sat, Apr 11, 2009 at 2:58 PM, Kenneth Kalmer
<> wrote:
> On Thu, Apr 9, 2009 at 5:17 PM, Paul Davis <>wrote:
>> Kenneth,
>> I'm pretty sure your issue is in the reduce steps for the daily and
>> monthly views. The general rule of thumb is that you shouldn't be
>> returning data that grows faster than log(#keys processed), whereas I
>> believe your data is growing linearly with input.
>> This particular limitation is a result of the implementation of
>> incremental reductions. Basically, each key/pointer pair stores the
>> re-reduced value for all [re-]reduce values in its child nodes. So
>> as your reduction moves up the tree, the data starts exploding, which
>> kills btree performance, not to mention the extra file I/O.
>> The basic moral of the story is that if you want reduce views like
>> this per user you should emit a [user_id, date] pair as the key and
>> then call your reduce views with group=true.
>> HTH,
>> Paul Davis
> Hi Paul
> Thanks for taking the trouble of investigating for me, I'll dive into the
> views and clean them up a bit according to your advice as well as brush up
> on the caveat you explained. I saw other threads in the archives where you
> gave similar advice, sorry for not realizing I stepped into the same trap.
> When I've got the issue resolved I'll update the gist and we can leave it as
> a point of reference for others.
> Thanks again!

It's kind of a hard one to notice right away as it's not an error; it
just kills performance. Perhaps Damien was right in that we should
think about adding log vomiting when we detect that there's a crap
load of data accumulating in the reductions.
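
For anyone finding this thread later, here is a rough sketch of the
advice quoted above, written as a plain Python simulation rather than
real view code. The documents, the "hits" field, and every function
name are made up for illustration; in an actual view the map function
would emit [user_id, date] as the key, and the view would be queried
with group=true.

```python
from collections import defaultdict

# Hypothetical documents; the field names are invented for this sketch.
DOCS = [
    {"user_id": "alice", "date": "2009-04-09", "hits": 120},
    {"user_id": "alice", "date": "2009-04-09", "hits": 80},
    {"user_id": "alice", "date": "2009-04-10", "hits": 50},
    {"user_id": "bob", "date": "2009-04-09", "hits": 10},
]


def map_doc(doc):
    """Stand-in for a map function that emits [user_id, date] as the key."""
    yield (doc["user_id"], doc["date"]), doc["hits"]


def reduce_sum(values):
    """Reduce to one number, so the value is the same size at every level."""
    return sum(values)


def query_grouped(docs):
    """Imitate a group=true query: one reduced value per distinct key."""
    rows = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            rows[key].append(value)
    return {key: reduce_sum(values) for key, values in sorted(rows.items())}


def reduce_nested(docs):
    """Anti-pattern: one reduce value holding an entry per user and day."""
    totals = {}
    for doc in docs:
        per_user = totals.setdefault(doc["user_id"], {})
        per_user[doc["date"]] = per_user.get(doc["date"], 0) + doc["hits"]
    return totals


grouped = query_grouped(DOCS)
nested = reduce_nested(DOCS)

for key, total in grouped.items():
    print(list(key), total)

# The nested value carries one entry per (user, date) pair, so it keeps
# growing with the input; each grouped value stays a single number.
print(sum(len(days) for days in nested.values()), "entries in the nested value")
```

The point of the comparison is only the shape of the reduce output: the
grouped version hands back a fixed-size value per key, while the nested
version has to carry every user and day up the tree.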

Paul Davis

> --
> Kenneth Kalmer
