Date: Sat, 16 Oct 2010 10:57:17 -0700
Subject: Re: how to count the number of unique values
From: "Eli Stevens (Gmail)"
To: user@couchdb.apache.org

Assuming your 'work' docs contain a type and a list of subject IDs
(code untested, sorry):

    map = function (doc) {
      if (doc.type == 'work') {
        for (var i in doc.subject_ids) {
          // Emitting a list containing the single doc._id keeps the
          // reduce function simpler; it's not required though.
          emit(doc.subject_ids[i], [doc._id]);
        }
      }
    }

    reduce = function (keys, values, rereduce) {
      var combinedList = [];
      for (var i in values) {
        // Each value is itself a list of IDs (from map, or from an
        // earlier reduce pass), so concatenate instead of nesting.
        combinedList = combinedList.concat(values[i]);
      }
      return combinedList;
    }

When queried with group=true, this produces a view with rows like:

    {key: 'subj_id1', value: ['work_id1', 'work_id2', ...]},
    {key: 'subj_id2', value: ['work_id2', 'work_id3', ...]},
    {key: 'subj_id3', value: ['work_id1', 'work_id4', ...]},

Does that help?
Eli

On Sat, Oct 16, 2010 at 8:04 AM, Anand Chitipothu wrote:
> 2010/10/15 Wout Mertens :
>> Just wanted to add that if you have a map function that emits (tag, 1)
>> for each tag and then a reduce function that's just _count, you will
>> have everything you need for painting a tag cloud.
>>
>> The view with group=true will list all tags exactly once, with their
>> count. CouchDB doesn't tell you how many rows are in the result so
>> you'll have to count them yourself.
>>
>> So you load that entire view in memory and you can draw the tags with
>> their relative sizes.
>>
>> Wout.
>
> The example I gave is a rather simplified example. I'm working with data
> containing 25M+ docs with books, works and subjects. I need to find
> the list/count of works for each subject. I don't think it is
> practical to load the view into memory to compute the required result.
>
> Anand
>
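
Since the map/reduce pair above is marked untested, here is a minimal sketch for checking its output shape outside CouchDB, in plain JavaScript. Everything in it is an assumption made for illustration: the sample docs, the runGroupedView helper, and passing emit as a parameter (in CouchDB, emit is a global and the map function takes only doc). It only imitates what a group=true query would return; it is not how CouchDB evaluates views internally.

```javascript
// Local harness only: imitates a grouped view query in plain JavaScript.
// sampleDocs and runGroupedView are hypothetical names for illustration.

function mapWork(doc, emit) {
  // emit is passed in here; in CouchDB it is a global function.
  if (doc.type === 'work') {
    (doc.subject_ids || []).forEach(function (subjectId) {
      emit(subjectId, [doc._id]);
    });
  }
}

function reduceWorks(keys, values, rereduce) {
  // Every value is a list of work IDs, whether it came from map or from
  // an earlier reduce pass, so flattening one level covers both cases.
  return values.reduce(function (all, ids) {
    return all.concat(ids);
  }, []);
}

function runGroupedView(docs) {
  // Collect emitted values per key, then reduce each group once.
  var groups = {};
  docs.forEach(function (doc) {
    mapWork(doc, function (key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });
  return Object.keys(groups).sort().map(function (key) {
    var ids = reduceWorks(null, groups[key], false);
    return { key: key, value: ids, count: ids.length };
  });
}

var sampleDocs = [
  { _id: 'work_id1', type: 'work', subject_ids: ['subj_id1', 'subj_id3'] },
  { _id: 'work_id2', type: 'work', subject_ids: ['subj_id1', 'subj_id2'] },
  { _id: 'book_id1', type: 'book' }
];

var rows = runGroupedView(sampleDocs);
console.log(JSON.stringify(rows, null, 2));
```

If only the per-subject counts are needed, the _count reduce mentioned in the quoted message gives them directly and avoids building a long ID list for every subject.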