From: Robert Newson
To: user@couchdb.apache.org
Date: Fri, 10 Dec 2010 14:28:22 +0000
Subject: Re: Suppressing duplicate documents from a view query
In-Reply-To: <4D023459.9040505@fiset.ca>

"I seem to remember that there is a request flag that can be passed to
a view query to eliminate document duplicates."

I've never heard of such an option. To remove duplicate keys from a
view, use a reduce function like "return null" and query with
reduce=true.

B.

On Fri, Dec 10, 2010 at 2:08 PM, Jean-Pierre Fiset wrote:
> This question relates to CouchDb 1.0.1.
>
> I have been reading as much as I can about CouchDb and started implementing an application. I
> seem to remember that there is a request flag that can be passed to a view query to eliminate
> document duplicates. However, I do not recall which one it is. A quick review of the wiki did
> not shed any light.
>
> Currently, I am removing duplicates by using a list function.
> I have documented the details of the list function here:
> http://www.bitsbythepound.com/remove-document-duplicates-from-couchdb-view-query-using-a-list-function-366.html
>
> In short, this is the list function:
>
> function(head, req) {
>    send('{"total_rows":'+head.total_rows
>        +',"offset":'+head.offset+',"rows":[');
>    var ids = {}
>        ,row
>        ,first = true
>        ;
>    while(row = getRow()) {
>        if( !ids[row.id] ) {
>            if( first ) {
>                first = false;
>            } else {
>                send( ',' );
>            };
>            send( toJSON(row) );
>            ids[row.id] = 1;
>        };
>    };
>    send(']}');
> }
>
> Questions:
> 1. Is there a much simpler way to achieve this?
> 2. If the list function is the way to solve the problem, can someone comment on the likelihood
> that this function will sustain the scaling up of the database?
>
> Thanks,
>
> JP
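
For reference, the reduce approach from the reply can be sketched roughly
as below. This is only an illustration under stated assumptions: the "tags"
field, the sample documents, the local emit() stand-in and the
queryGrouped() helper are all made up for the example and are not part of
CouchDB or of the original thread. The helper imitates, in plain
JavaScript, what a view query with group=true returns: one row per distinct
key. With reduce=true alone the whole key range collapses into a single
row, so grouping is what actually removes the duplicate keys.

```javascript
// Local stand-ins so the sketch runs outside CouchDB (assumptions, not CouchDB API).
var rows = [];
var currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Map function in the shape CouchDB expects; "tags" is a hypothetical field.
var mapFn = function (doc) {
  (doc.tags || []).forEach(function (tag) {
    emit(tag, null);
  });
};

// The reduce suggested in the reply: the value is irrelevant, only grouping matters.
var reduceFn = function (keys, values, rereduce) {
  return null;
};

// Rough imitation of querying the view with ?group=true (string keys only).
function queryGrouped(docs) {
  rows = [];
  docs.forEach(function (doc) {
    currentId = doc._id;
    mapFn(doc);
  });
  var groups = {};
  var order = [];
  rows.forEach(function (row) {
    if (!Object.prototype.hasOwnProperty.call(groups, row.key)) {
      groups[row.key] = [];
      order.push(row.key);
    }
    groups[row.key].push(row);
  });
  order.sort();
  return order.map(function (key) {
    var group = groups[key];
    var keys = group.map(function (r) { return [r.key, r.id]; });
    var values = group.map(function (r) { return r.value; });
    return { key: key, value: reduceFn(keys, values, false) };
  });
}

var sampleDocs = [
  { _id: "a", tags: ["x", "y"] },
  { _id: "b", tags: ["y", "z"] }
];
var grouped = queryGrouped(sampleDocs);
console.log(JSON.stringify(grouped));
```

Note that this collapses rows by key, not by document id. If the same
document is emitted under several different keys, it still shows up once
per key, and the grouped rows no longer carry the document id. Filtering on
row.id, as the list function above does, covers that case, at the cost of
keeping every id seen so far in memory for the duration of the request.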