Subject: Re: querying multiple views
From: J Chris Anderson
Date: Sun, 11 Jul 2010 07:57:23 -0700
To: user@couchdb.apache.org

On Jul 11, 2010, at 7:27 AM, Norman Barker wrote:

> Afshin
>
> I have got the all clear from my work to release this as a patch, I
> expect to be putting something up on github by the end of the week
> (internal paperwork permitting). I am going to implement it as an
> external handler so it can be used and reviewed and from there it will
> be under a do what you want with it license so it can go into couchdb
> if accepted.
>

If you're implementing in Erlang, you probably don't need to go full hog as an external.

Because the configuration system is so modular, you are probably best adding it as a new httpd_design_handler, or httpd_db_handler.

It should be easy to create a new module and link it in via the configuration file.

I don't know if you plan to allow querying across databases (I'd suggest restricting the queries to a single database if you want to stay within CouchDB's security model, and have it more likely to be accepted as a patch.)

We should really move this discussion to dev@ -- a lot of the developers only give a cursory glance at the user list, so you will get more valuable feedback there.

Thanks for taking the time to write and release the patch!

Chris

> Chris, thanks for the help with the reduce function and confirming the concept!
>
> Norman
>
> On Sun, Jul 11, 2010 at 8:02 AM, afshin afzali wrote:
>> Hi Norman, Chris
>>
>> I just wanted to say this is the same problem we are currently facing
>> with. We are implementing a Local Business Directory application on
>> couchdb. Our searches need to combine several keys together to find
>> right entries. To do something like that in server side, we had the
>> paging mechanism problem, so we have chosen that do Norman's algorithm
>> in client side! I'll appreciate if there will be a successful progress
>> in this issue.
>>
>> BEST,
>> -- afshin
>>
>> On 7/8/10, J Chris Anderson wrote:
>>>
>>> On Jul 8, 2010, at 10:43 AM, Norman Barker wrote:
>>>
>>>> Hi,
>>>>
>>>> I have been thinking about how to query multiple views at one time.
>>>>
>>>> I have an erlang handler in couchdb that takes a http post containing
>>>> N view queries, each query contains a startkey and an endkey, I then
>>>> open up each view in parallel (using pmap) and accumulate the doc ids,
>>>> then I use the erlang sets module to get the unique values. All good
>>>> and looks pretty (and works), though it doesn't scale since I am
>>>> holding all the results on the server (potential memory overload!)
>>>> whereas I would like to stream the results to the client one by one.
>>>>
>>>> I am thinking of doing the following but have some questions;
>>>>
>>>> My first question is when I do
>>>>
>>>> couch_view:fold(View, FoldlFun, FoldAccInit,
>>>> couch_httpd_view:make_key_options(Args)),
>>>>
>>>> is there a way to call the _count reduce function in code to find the
>>>> number of rows in the slice between startkey and endkey?
>>>>
>>>> If so, I would like to order all the views in the posted query
>>>> document by the result of _count from smallest to largest.
>>>>
>>>> I would then fold over the smallest result view and pull each document
>>>> id (*) in turn.
>>>>
>>>> With each document id I would then call each of the other views in
>>>> turn with their startkey and endkey and in addition include
>>>> startkey_docid and endkey_docid with the docid in * above, again
>>>> calling _count I can test for inclusion. If the doc id is in all views
>>>> then I will immediate stream this to the client.
>>>>
>>>> Am I doing something stupid, is this optimal?
>>>>
>>>
>>> It sounds like you are on the right track. this could be a very valuable
>>> patch to CouchDB once you have it working.
>>>
>>>> Any help with the programmatic _count call would be great.
>>>>
>>>
>>> One hint: maybe the call to reduce_to_count will help.
>>>
>>> Here's an implementation of a reduce query in Erlang.
>>>
>>> http://github.com/jchris/hovercraft/blob/master/hovercraft.erl#L217
>>>
>>> Sorry I can't be more helpful. I've successfully bootstrapped this stuff in
>>> my head before, but it always takes a couple of hours of turning my brain
>>> into a step debugger.
>>>
>>> Good luck!
>>>
>>> Once you get deeper into the code you might have better luck getting
>>> responses on the dev@ list or maybe the #couchdb IRC channel on freenode.
>>>
>>> Chris
>>>
>>>
>>>> thanks,
>>>>
>>>> Norman
>>>
>>>
>>
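
For readers following the thread, the streaming intersection Norman describes (count each slice, order the queries from smallest to largest, walk the smallest slice, and test each doc id against the other slices) can be sketched roughly as below. This is only an illustrative, language-neutral sketch in Python, not CouchDB's Erlang API: the in-memory row lists and the helper names `slice_rows`, `count`, `contains`, and `intersect` are all assumptions standing in for real view lookups and the `_count` reduce.

```python
from bisect import bisect_left, bisect_right
from typing import Iterator, List, Tuple

Row = Tuple[str, str]                # (key, doc_id); rows are sorted by key
Query = Tuple[List[Row], str, str]   # (view rows, startkey, endkey)


def slice_rows(rows: List[Row], startkey: str, endkey: str) -> List[Row]:
    """Rows whose key lies in [startkey, endkey]; hypothetical stand-in for a view range read."""
    keys = [key for key, _ in rows]
    return rows[bisect_left(keys, startkey):bisect_right(keys, endkey)]


def count(query: Query) -> int:
    """Number of rows in the slice; plays the role of a _count reduce here."""
    rows, startkey, endkey = query
    return len(slice_rows(rows, startkey, endkey))


def contains(query: Query, doc_id: str) -> bool:
    """True if doc_id appears anywhere in the query's slice."""
    rows, startkey, endkey = query
    return any(d == doc_id for _, d in slice_rows(rows, startkey, endkey))


def intersect(queries: List[Query]) -> Iterator[str]:
    """Yield doc ids present in every slice, one at a time."""
    ordered = sorted(queries, key=count)   # smallest slice first
    if not ordered:
        return
    smallest, rest = ordered[0], ordered[1:]
    seen = set()                           # bounded by the smallest slice only
    for _, doc_id in slice_rows(*smallest):
        if doc_id in seen:
            continue
        seen.add(doc_id)
        if all(contains(q, doc_id) for q in rest):
            yield doc_id                   # can be streamed to the client immediately


# Made-up example data: two "views" over the same documents.
by_city = sorted([("berlin", "d1"), ("berlin", "d2"), ("paris", "d3")])
by_category = sorted([("bar", "d1"), ("cafe", "d2"), ("cafe", "d3")])
queries = [(by_city, "berlin", "berlin"), (by_category, "cafe", "cafe")]

for doc_id in intersect(queries):
    print(doc_id)
```

Because `intersect` is a generator, only the doc ids of the smallest slice are tracked in memory, which is the property the original proposal is after; a real handler would replace the list scans with index lookups.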