couchdb-user mailing list archives

From Rob Crowell <robccrow...@gmail.com>
Subject Re: Email statistics : using reduce for uniques
Date Wed, 15 Oct 2014 22:29:46 GMT
I've not worked with CouchDB for a year or so now, but back when I was
using it I ran into the same problem as you.  It was suggested that a _list
function would be an appropriate way to handle this but I don't know if
that's still the current thinking.

With this example I believe the database still reads every grouped row in
order to count them, but at least the counting happens on the server before
anything is streamed back to you, so only the total is returned.

http://stackoverflow.com/a/8142524/195125

function(head, req) {
 // Count the view rows inside CouchDB instead of sending them to the client.
 var count = 0;
 while (getRow()) count++;
 return JSON.stringify({count: count});
}
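
If you need the unique count per email and type in one request, rather than
a single total, the same idea can be extended along the lines Aurélien
describes in the quoted message below. This is only a rough sketch, not
something I have run against a real database: it assumes the view is keyed
by [email_id, type, contact_id], is reduced with _count, and is queried
through the list with group=true, so that each row stands for one distinct
contact. The names countUniques, nextRow and emitResult are made up for the
illustration.

```javascript
// Hypothetical helper (not part of CouchDB): counts distinct contacts per
// [email_id, type] prefix. Assumes rows arrive in view order from a
// group=true query on keys of the form [email_id, type, contact_id].
function countUniques(nextRow, emitResult) {
  var current = null; // the [email_id, type] prefix currently being counted
  var count = 0;
  var row;
  while ((row = nextRow())) {
    var prefix = [row.key[0], row.key[1]];
    // A change of prefix means the previous group is complete: send it now.
    if (current && (prefix[0] !== current[0] || prefix[1] !== current[1])) {
      emitResult({ key: current, unique: count });
      count = 0;
    }
    current = prefix;
    count++; // one grouped row corresponds to one distinct contact
  }
  // No rows left: flush the last group, if there was one.
  if (current) emitResult({ key: current, unique: count });
}

// Inside a list function this could be wired up roughly as follows
// (the helper would have to live in the same function body):
//   function(head, req) {
//     countUniques(getRow, function(r) { send(JSON.stringify(r) + "\n"); });
//   }
```

Sending each result as soon as the prefix changes keeps memory use constant,
however many contacts there are.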


On Wed, Oct 15, 2014 at 5:17 PM, Gijs Nelissen <gijs@prezly.com> wrote:

> On Sat, Oct 11, 2014 at 9:23 PM, Sebastian Rothbucher <
> sebastianrothbucher@googlemail.com> wrote:
>
> > Thanks Aurélien, this is great! And indeed I think one does need the
> > contact_id as part of the key, otherwise there is no way of having
> > uniqueness. And as soon as it is part of the key, there is no reduce of
> > stuff belonging to different contact_ids. So the replacement for |sort
> > |uniq |wc -l in UNIX is a key with the sorting criterion plus a list
> > function ;-)  Again thanks, it helps me a lot also!!!!!
> >
>
> I tried to do it this way:
>
> function(doc) {
>     if (doc.contact.id && doc.email.id && doc.license.id && doc.release.id
>             && doc.type.substring(0,6) == 'email_') {
>         var type = doc.type;
>         emit([doc.email.id, type, doc.contact.id], null);
>     }
> }
>
> and a native count function.
>
> Now how do I get the number of unique opens?
>
> Query with key -> [email.id, 'opens', []] + group_level=3 and then count
> the number of results?
> What am I missing here?
>
>
>
>
>
> >
> > On Fri, Oct 10, 2014 at 6:00 PM, Aurélien Bénel <aurelien.benel@utt.fr>
> > wrote:
> >
> > > Hi Gijs,
> > >
> > > > I have been trying different approaches to achieve this by using a
> > > > custom map and reduce function.
> > >
> > > My rule of thumb is to avoid custom reduce functions at all cost.
> > > Maybe it's a bit harsh but it saved me a lot of time and frustration.
> > >
> > > >>> Now I want to do a very mailchimp/campaignmonitor-like summary per
> > > >>> campaign (key[3]) that shows nr of unique delivers, nr of unique
> > > >>> opens, nr of unique clicks.
> > > > SELECT count(*) FROM events WHERE type='click' GROUP BY contact_id;
> > > > But I want the single view to output both the unique clicks, views
> > > > and opens
> > >
> > >
> > > First, you should emit the following keys (in this order, with no
> > > value):
> > >
> > >     [campaign, type, contact_id]
> > >
> > > Then you can reduce those data (with any builtin reduce function,
> > > `_count` for example) and `group=true` (which is shorter than
> > > `reduce=true&group_level=exact`).
> > >
> > > Then you'll need another computation round to count unique contacts
> > > (per campaign and type). While waiting for chained map-reduce (coming
> > > soon I hope), you can cheat and do it with a list.
> > > With a list, it is usually a good idea to send the results as soon as
> > > you know them (i.e., in this case, when there is a new `type` in the
> > > current row or when there are no rows anymore).
> > >
> > >
> > > Regards,
> > >
> > > Aurélien
> > >
> > >
> > >
> >
>
