incubator-couchdb-user mailing list archives

From afters <afters.m...@gmail.com>
Subject Re: Getting reduce overflow error
Date Thu, 17 Jun 2010 17:49:34 GMT
On 17 June 2010 20:06, J Chris Anderson <jchris@gmail.com> wrote:

>
> On Jun 17, 2010, at 9:29 AM, afters wrote:
>
> > On 17 June 2010 18:10, J Chris Anderson <jchris@gmail.com> wrote:
> >
> >>
> >> The reduce-limit is a general heuristic, because some very bad reduces will
> >> actually grow asymptotically so that the full reduce contains as much data
> >> as the entire group=true reduce. It sounds like yours is OK (large but not
> >> growing) so you are probably fine (although keeping 4kb of stuff in the
> >> intermediate reduction value storage is going to kill performance).
> >
> >
> > I could limit it to 1kb perhaps - at this point it doesn't matter too much.
> > I imagine it would still maim, if not kill, performance. Correct?
>
> I bet 1kb will be more than 4 times faster than 4kb, so it's worth a shot.
> But I'm guessing you are probably better off in terms of scalability to have a
> lean reduce index, and use the results from that to know which document to
> fetch.
>
> OTOH if you are gonna be working only with smaller data sets, then you may
> even be fine with what you've got. Just be aware that with large reductions
> (especially reductions that are giant when called without group=true) you
> are introducing a bunch of overhead, and things will slow down as your
> database grows.
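
To make the "large but not growing" distinction above concrete, here is a rough Python model of a size-based overflow check. It is only a sketch: the function names are hypothetical, and the 200-byte floor and factor of 2 are illustrative assumptions, not confirmed CouchDB settings.

```python
import json

def sum_reduce(values):
    """Lean reduce: the output is one number however many values come in."""
    return sum(values)

def collect_reduce(values):
    """Growing reduce: the output keeps every input value."""
    out = []
    for v in values:
        out.extend(v if isinstance(v, list) else [v])
    return out

def looks_like_overflow(reduce_fn, values, min_bytes=200, ratio=2):
    """Hypothetical heuristic: flag output that is both large in absolute
    terms and not much smaller than the input it was computed from."""
    in_len = len(json.dumps(values))
    out_len = len(json.dumps(reduce_fn(values)))
    return out_len > min_bytes and out_len * ratio > in_len

values = list(range(1000))
lean_flagged = looks_like_overflow(sum_reduce, values)         # small output
growing_flagged = looks_like_overflow(collect_reduce, values)  # output as big as input
```

In this model only the collecting reduce is flagged, because its output is as large as its input.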


>
Is it correct that reductions spread up the b-tree only as high as needed to
satisfy the group-level demands?


> If you keep your reduces simple, like _sum and _count, or similar data
> structures, you should be fine.
>
> Read this for a survey of reduction techniques that can scale
> http://labs.google.com/papers/sawzall.html
>
>
I will look into that. Thanks.
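
For reference, the simple reduces suggested above can be modelled in a few lines of Python. This is a sketch that mimics the reduce(keys, values, rereduce) shape of a CouchDB reduce function; it is not the built-in _sum or _count themselves, and `reduce_in_chunks` is a hypothetical helper standing in for partial reductions being combined.

```python
def sum_reduce(keys, values, rereduce):
    # Adding numbers works the same on mapped values and on partial sums.
    return sum(values)

def count_reduce(keys, values, rereduce):
    # First pass counts mapped rows; rereduce adds up the partial counts.
    return sum(values) if rereduce else len(values)

def reduce_in_chunks(reduce_fn, values, chunk_size=3):
    """Reduce fixed-size chunks first, then rereduce the partial results."""
    partials = [
        reduce_fn(None, values[i:i + chunk_size], False)
        for i in range(0, len(values), chunk_size)
    ]
    return reduce_fn(None, partials, True)

values = [5, 1, 4, 2, 8, 3, 7]
total = reduce_in_chunks(sum_reduce, values)
count = reduce_in_chunks(count_reduce, values)
```

Either way the intermediate value is a single number, so it stays the same size however many rows feed into it.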


> >
> >> Any way to break it up and maybe use the reduce to know which document to
> >> query to get the big blob of text?
> >>
> >>
> > I could certainly do that. Indeed my original plan, before discovering the
> > magic of 'group=true', was to fetch each piece of entity-data separately.
> >
> > a.
> >
> >
> >> Chris
> >>
> >>
>
>
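
A minimal sketch of the "lean reduce index, then fetch the document" idea from the thread, in Python. The document ids, the `ts` field, and the choice of "newest wins" as the reduce criterion are all assumptions made for illustration; the in-memory dict stands in for the database.

```python
def newest_reduce(keys, values, rereduce):
    # Each value is a small [timestamp, doc_id] pair. Partial results have
    # the same shape, so one expression serves both reduce and rereduce.
    return max(values)

# Hypothetical documents holding the large text.
docs = {
    "doc-a": {"entity": "e1", "ts": 10, "text": "x" * 4096},
    "doc-b": {"entity": "e1", "ts": 25, "text": "y" * 4096},
    "doc-c": {"entity": "e1", "ts": 17, "text": "z" * 4096},
}

# Map step: emit only the small pair, never the text itself.
mapped = [[d["ts"], doc_id] for doc_id, d in docs.items()]

# Reduce two partial groups, then rereduce the partial results.
partials = [
    newest_reduce(None, mapped[:2], False),
    newest_reduce(None, mapped[2:], False),
]
ts, winner_id = newest_reduce(None, partials, True)

# Second request: fetch the one document the index pointed at.
text = docs[winner_id]["text"]
```

The reduce value stays a few bytes per group, and the 4kb of text is read only once, from the document itself.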
