Date: Sun, 7 Mar 2010 15:28:14 -0800
Subject: Re: Map reduce and weird output question
From: Randall Leeds
To: user@couchdb.apache.org

I'm not an expert on this, but I think you need to create your own
reduce function and output the number of keys rather than the sum of
the values.

On Sun, Mar 7, 2010 at 15:15, Gregory Tappero wrote:
> Thank you Pawel,
>
> If I try to follow your way it gives me the count of docs in a given
> day for each username; what I would like is the count of unique
> usernames for a given day.
>
> function(doc) {
>
>    if (doc.doc_type=="EdoPing" && doc.em_type==0) {
>        date = new Date().setRFC3339(doc.created_at);
>        emit([date.getFullYear(), parseInt(date.getMonth())+1,
>              date.getDate(), doc.em_uname] , 1);
>
>    }
> }
>
> Reduce:
>  _count
>
> =================
> I get:
>
> [2010, 3, 3, "student1"]         5
> [2010, 3, 4, "student1"]         18
> [2010, 3, 5, "eong"]             77
> [2010, 3, 6, "bkante"]           71
> [2010, 3, 6, "jfrancillette"]    72
> [2010, 3, 6, "mlouviers"]        12
> [2010, 3, 7, "student1"]         4
>
> I would like to extract the following:
>
> [2010, 3, 3]       1
> [2010, 3, 4]       1
> [2010, 3, 5]       1
> [2010, 3, 6]       3
> [2010, 3, 7]       1
>
>
> If I do a group_level=3 it sums the values.
>
> {"key":[2010,3,3],"value":5},
> {"key":[2010,3,4],"value":18},
> {"key":[2010,3,5],"value":77},
> {"key":[2010,3,6],"value":155},
> {"key":[2010,3,7],"value":4}
>
> How can I count the unique usernames emitted per day?
>
>
>
>
> On Sun, Mar 7, 2010 at 10:02 PM, Paweł Stawicki wrote:
>> Just emit all documents with em_type = 0 in the map function, with
>> [date, em_uname] as key. Then count in reduce.
>>
>> Map:
>> function(doc) {
>>  if (doc.em_type == 0) {
>>    // If you only want to count, you can emit anything (e.g. 1) instead
>>    // of doc here.
>>    // date: the day part of created_at, e.g. "2010-03-03"
>>    var date = String(doc.created_at).substring(0, 10);
>>    emit([date, doc.em_uname], doc);
>>  }
>> }
>>
>> Reduce:
>> function(keys, values, rereduce) {
>>  // If you emit 1 instead of doc, then the count of the values equals
>>  // their sum.
>>  if (!rereduce) {
>>    return values.length;   // count of values
>>  } else {
>>    return sum(values);     // sum of the partial counts
>>  }
>> }
>>
>> Then you can handle everything by grouping:
>> http://yourserver:5984/yourdb/_view/yourview?group_level=2
>> or group=true
>>
>> Regards
>> --
>> Paweł Stawicki
>> http://pawelstawicki.blogspot.com
>> http://szczecin.jug.pl
>>
>>
>>
>> On Sat, Mar 6, 2010 at 16:26, Gregory Tappero wrote:
>>
>>> Hello everyone,
>>>
>>> I have documents of the following EdoPing type:
>>>
>>> {
>>>   "_id": "22add509c1e7bc286832edc5bfe99ce5",
>>>   "_rev": "1-49663ab8778f445e481143120d0d7086",
>>>   "doc_type": "EdoPing",
>>>   "em_uname": "student1",
>>>   "em_gid": 1,
>>>   "created_at": "2010-03-03T14:18:19Z",
>>>   "em_ip": "92.154.70.148",
>>>   "em_type": 0,
>>>   "room_url": "z2fudcvcrfa3reaydatre",
>>>   "room_users": [
>>>       "tutorsbox"
>>>   ]
>>> }
>>>
>>> I would like to count all unique em_uname values of em_type 0 on a
>>> given date.
>>>
>>> For now I used this map/reduce:
>>> http://friendpaste.com/5xUUQ26bbl9d5KRB8eojwe
>>>
>>> Date.prototype.setRFC3339 = function(dString){
>>>    var regexp =
>>>
>>> /(\d\d\d\d)(-)?(\d\d)(-)?(\d\d)(T)?(\d\d)(:)?(\d\d)(:)?(\d\d)(\.\d+)?(Z|([+-])(\d\d)(:)?(\d\d))/;
>>>
>>>    if (dString.toString().match(new RegExp(regexp))) {
>>>        var d = dString.match(new RegExp(regexp));
>>>        var offset = 0;
>>>
>>>        this.setUTCDate(1);
>>>        this.setUTCFullYear(parseInt(d[1],10));
>>>        this.setUTCMonth(parseInt(d[3],10) - 1);
>>>        this.setUTCDate(parseInt(d[5],10));
>>>        this.setUTCHours(parseInt(d[7],10));
>>>        this.setUTCMinutes(parseInt(d[9],10));
>>>        this.setUTCSeconds(parseInt(d[11],10));
>>>        if (d[12])
>>>            this.setUTCMilliseconds(parseFloat(d[12]) * 1000);
>>>        else
>>>            this.setUTCMilliseconds(0);
>>>        if (d[13] != 'Z') {
>>>            offset = (d[15] * 60) + parseInt(d[17],10);
>>>            offset *= ((d[14] == '-') ? -1 : 1);
>>>            this.setTime(this.getTime() - offset * 60 * 1000);
>>>        }
>>>    } else {
>>>        this.setTime(Date.parse(dString));
>>>    }
>>>    return this;
>>> };
>>>
>>> var seenKeys = new Array();
>>>
>>> function(doc) {
>>>
>>>
>>>    if (doc.doc_type=="EdoPing" && doc.em_type==0) {
>>>        date = new Date().setRFC3339(doc.created_at);
>>>        var key = doc.em_uname + String(doc.created_at).substring(0,10);
>>>        if (seenKeys[key] == undefined) {
>>>            seenKeys[key] = 1;
>>>            emit([date.getFullYear(), parseInt(date.getMonth())+1,
>>>                  date.getDate() ] , 1);
>>>        }
>>>    }
>>> }
>>>
>>>
>>> It works when saved for the first time, but as soon as new EdoPings
>>> get added it starts emitting rows it has already seen (same key),
>>> creating faulty count results.
>>>
>>> Is it OK to have seenKeys outside of the doc function()?
>>> What other way could I use to get the same results?
>>>
>>> Thanks,
>>>
>>> Greg
>>>
>>
>
>
>
> --
> Greg Tappero
> CTO co founder Edoboard
> http://www.edoboard.com
> +33 0645764425
>
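
For the per-day unique count asked about in the thread, here is a minimal
sketch of one possible approach. It is an assumption, not something anyone
in the thread proposed: keep the map that emits [year, month, day, em_uname]
with the _count reduce, query it with group=true so that each (day, username)
pair comes back as exactly one row, and then count rows per day on the
client. The helper name countUniquePerDay is hypothetical and not part of
CouchDB; the sample rows are the ones quoted above.

```javascript
// Rows shaped like the view output quoted in the thread when queried with
// ?group=true: one row per [year, month, day, username] key.
const rows = [
  { key: [2010, 3, 3, "student1"], value: 5 },
  { key: [2010, 3, 4, "student1"], value: 18 },
  { key: [2010, 3, 5, "eong"], value: 77 },
  { key: [2010, 3, 6, "bkante"], value: 71 },
  { key: [2010, 3, 6, "jfrancillette"], value: 72 },
  { key: [2010, 3, 6, "mlouviers"], value: 12 },
  { key: [2010, 3, 7, "student1"], value: 4 },
];

// Hypothetical client-side helper (not a CouchDB API).
// Each grouped row stands for one distinct (day, username) pair, so the
// number of rows sharing a [year, month, day] prefix is the number of
// unique usernames for that day. The per-row value (ping count) is ignored.
function countUniquePerDay(groupedRows) {
  const counts = new Map(); // insertion order follows the view's key order
  for (const row of groupedRows) {
    const dayKey = JSON.stringify(row.key.slice(0, 3));
    counts.set(dayKey, (counts.get(dayKey) || 0) + 1);
  }
  return Array.from(counts, ([day, count]) => ({
    key: JSON.parse(day),
    value: count,
  }));
}

const uniquePerDay = countUniquePerDay(rows);
console.log(uniquePerDay);
```

With the rows quoted in the thread this gives 1, 1, 1, 3 and 1 for March 3
to March 7, which is the output Greg asked for. It also keeps the map
function free of outside state such as seenKeys: CouchDB updates views
incrementally and runs the map only over new or changed documents, so a
variable held outside the function is not a reliable record of what has
already been emitted.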