Date: Wed, 17 Nov 2010 10:30:36 -0700
Subject: Re: Forcing document reindex
From: Ryan Ramage
To: user@couchdb.apache.org

What about a list function? You can read the query parameters from the request (e.g. the id needed for the report) and then iterate through the docs and build up your relationship. I don't know how fast it will be with the number of docs you have, but it will be linear time (order n).

http://guide.couchdb.org/editions/1/en/lists.html

On Wed, Nov 17, 2010 at 9:13 AM, Nicolas Jessus wrote:
> All right; no one should like what they're going to read.
>
> I have a medium-sized MySQL system, which translates to a Couch with about a
> million documents of about 20 types. The system would really benefit from a
> schema-free design. The data is only weakly relational. Couch would fit really
> well, enough that I don't mind twisting its arm in a few places if need be; the
> tradeoff would be worth it.
>
> The hiccup is reporting. Some of it involves the full set of documents. Let's
> say I have 5 categories of documents involved in a report, A to E. A links to B,
> B links to C, etc. The report needs data from A, B, and E. As far as I can
> think, there's no way to do a view collation, because A and B share an ID but E
> doesn't.
> I can't pull a million documents from the DB to process elsewhere
> either, so that nixes simple indexing and the '_id' object values.
>
> I could however write a special view_server that will emit keys after checking
> the linked ID through an HTTP call (that's where you scream). Indexing
> performance is totally unimportant to me, DB updates are relatively few, and I
> can live with the dirty side-effects (again, the system as a whole would still
> be much cleaner than the MySQL one).
>
> With that solution I can have a map function that just handles docs of type A.
> But I still need to reindex the relevant As when B or E changes. I could simply
> listen to the change stream and force a reindex, but that doesn't work well with
> legitimate updates when the _rev number goes up at random even though the doc
> hasn't changed, and there's no auto-merge. So I'm pretty stuck.
>
> I'm not asking that this type of functionality be encouraged. It's clearly
> subverting the point of Couch. On the other hand, it doesn't seem like having a
> force-reindex function would dirty the concept, and if it's easy to code, then
> it's a shame it doesn't exist.
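
A rough sketch of the list-function idea from the top of this message, written in Python so it can be run standalone. A real CouchDB list function would be JavaScript, pulling rows with getRow(), reading parameters from req.query, and writing output with send(). Everything about the data here is an assumption made for illustration: the row fields ("id", "type", "link", "data"), the MAX_HOPS limit, and the name build_report are hypothetical and do not come from the thread.

```python
# Hypothetical row shape (an assumption, not from the thread):
#   {"id": ..., "type": "A".."E", "link": id of the next doc in the chain, "data": ...}

MAX_HOPS = 4  # A -> B -> C -> D -> E is at most four links; also guards against cycles


def build_report(rows, report_id=None):
    """Index every row once, then walk the A -> ... -> E chain for each A doc."""
    by_id = {}
    a_rows = []
    for row in rows:  # single pass over the rows: order n
        by_id[row["id"]] = row
        # report_id plays the role of a query parameter on the request
        if row["type"] == "A" and report_id in (None, row["id"]):
            a_rows.append(row)

    for a in a_rows:
        chain = {"A": a}
        current = a
        for _ in range(MAX_HOPS):  # bounded work per A doc
            nxt = by_id.get(current.get("link"))
            if nxt is None:
                break  # broken link: this A yields no report line
            chain[nxt["type"]] = nxt
            current = nxt
            if nxt["type"] == "E":
                break
        if "B" in chain and "E" in chain:
            yield {"a": a["data"], "b": chain["B"]["data"], "e": chain["E"]["data"]}


rows = [
    {"id": "a1", "type": "A", "link": "b1", "data": "alpha"},
    {"id": "b1", "type": "B", "link": "c1", "data": "beta"},
    {"id": "c1", "type": "C", "link": "d1", "data": None},
    {"id": "d1", "type": "D", "link": "e1", "data": None},
    {"id": "e1", "type": "E", "link": None, "data": "epsilon"},
]

for line in build_report(rows, report_id="a1"):
    print(line)
```

Note that this sketch keeps every row in memory while indexing. With around a million documents, the rows would probably need to be trimmed down to just the link and report fields, so treat it as an outline of the approach rather than something tuned for that volume.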