couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: Possible bug in indexer... (really)
Date Fri, 03 Jul 2009 22:37:51 GMT
2009/7/3 Göran Krampe <goran@krampe.se>:
> Hi folks!
>
> We are writing an app using CouchDB where we tried to do some map/reduce to
> calculate "period sums" for about 1000 different "accounts". This is fiscal
> data btw, the system is meant to store detailed fiscal data for about 50000
> companies, for starters. :)
>
> The map function is trivial: it just emits a bunch of "accountNo, amount"
> pairs with "month" as the key.
>
> The reduce/rereduce takes these and builds a dictionary (JSON object) with
> "month-accountNo" as key (like "2009/10-2335") and the sum as the value. This
> works fine, yes, it builds up a bit, but there is a bounded number of account
> numbers and months so it doesn't grow out of control, so that is NOT the
> issue.

There is *no reason ever* to build up a dictionary with more than a
small handful of items in it. E.g. it's OK if your dictionary has this
fixed set of keys: count, total, stddev, avg.
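
For example (a minimal sketch only, with made-up field names, not your
actual view), a reduce whose result is a fixed-size object - say count
and total - stays tiny no matter how many rows it folds, and rereduces
cheaply:

    // reduce: the result always has exactly two keys, so it never grows
    function(keys, values, rereduce) {
      var out = {count: 0, total: 0};
      for (var i = 0; i < values.length; i++) {
        if (rereduce) {
          // values are earlier reduce results
          out.count += values[i].count;
          out.total += values[i].total;
        } else {
          // values are the raw emitted amounts
          out.count += 1;
          out.total += values[i];
        }
      }
      return out;
    }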

It's not OK to do what you are doing. This is what group_level is for.
Rewrite your map/reduce to be correct and then we can start talking
about performance.
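
Something like this is what I mean (again just a sketch; doc.month,
doc.accountNo and doc.amount stand in for whatever your docs actually
call those fields):

    // map: put month and account number in the key instead of the value
    function(doc) {
      emit([doc.month, doc.accountNo], doc.amount);
    }

    // reduce: a plain sum works the same for reduce and rereduce,
    // since rereduce just sums earlier sums
    function(keys, values, rereduce) {
      var total = 0;
      for (var i = 0; i < values.length; i++) {
        total += values[i];
      }
      return total;
    }

Then query the view with ?group_level=2 and CouchDB hands you one row
per [month, accountNo] pair - exactly the per-period sums you were
building by hand inside your reduce.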

I don't mean to be harsh but suggesting you have a performance problem
here is like me complaining that my Ferrari makes a bad boat.

Cheers,
Chris




>
> Ok, here comes the punchline. When we dump the first 1000 docs using bulk,
> which typically will amount to say 5000 emits - and we "touch" the view to
> trigger it - it is rather fast and behaves like this:
>
> - a single Erlang process runs and emits all values, then it does a bunch of
> reduces on those values, finally it switches into rereduce mode and does
> those, and then you can see the dictionary "growing" a bit but never too
> much. It is pretty fast, a second or two all in all.
>
> Fine. Then we dump the *next* 1000 docs into Couch and trigger the view
> again. This time it behaves like this (believe it or not):
>
> - two Erlang processes come into play. It seems the same process as above
> continues with emits (IIRC) but a second one starts doing reduce/rereduce
> *while the first one is emitting*. Ouch. And to make it worse - the second
> one seems to gradually "take over" until we only see 2-3 emits followed by
> tons of rereduces (all the way up, I guess, for each emit).
>
> Sooo... evidently Couch decides to do stuff in parallel and starts doing
> reduce/rereduce while emitting here. AFAIK this is not the documented
> behavior. The net effect is that the view update that took 1-2 seconds
> suddenly takes 400 seconds or grinds to a total crawl and never seems to end.
>
> By looking at the log it obviously processes ONE doc at a time - giving us
> 2-5 emits typically - and then tries to reduce that all the way up to the
> root before processing the next doc. So the rereduces for the internal nodes
> will in this case typically be run 1000x more often than needed.
>
> Phew. :) Ok, so we are basically hosed with this behavior in this situation.
> I can only presume this has gone unnoticed because:
>
> a) Most updates are small. But we dump thousands of new docs using bulk (a
> full new fiscal year of data for a given company) so we definitely notice
> it.
>
> b) Most reduce/rereduce functions are very, very fast, so it goes unnoticed.
> Our functions are NOT that fast - but if they were only run when they should
> be (well, presuming they *should* only be run after all the emits for all doc
> changes in a given view update) it would indeed be fast anyway. We can see
> that, since the first 1000 docs work fine.
>
> ...and thanks to the people on #couchdb for discussing this with me earlier
> today and looking at the Erlang code to try to figure it out. I think Adam
> Kocoloski and Robert Newson had some idea about it.
>
> regards, Göran
>
> PS. I am on vacation now for 4 weeks, so I will not be answering much email.
> I wanted to get this posted though since it is in some sense a rather ...
> serious performance bottleneck.
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
