Date: Mon, 20 Sep 2010 17:02:35 -0600
Subject: Re: multiview on github
From: Norman Barker
To: dev@couchdb.apache.org

Bob,

I can see why plugins might work for you since your ontology / indexing code is GPL, however I am more than happy for the multiview to be Apache licensed and would like to see it in trunk.

I like the concept of plugins as it creates a stable API for third parties, but I think a multiview is a core feature of CouchDB.

Norman

On Mon, Sep 20, 2010 at 4:19 AM, Robert Dionne wrote:
> I see, neat.
>
> I ask because you might treat disjunction and conjunction differently in terms of whether you run around the ring or broadcast to all the nodes. For conjunctions you need all to succeed so broadcast might fare better whereas for disjunctions only one need succeed. I suppose it would depend largely on the number of views and the amount of each computation.
>
> Anyway I guess I have mixed feelings about seeing this in core. I see a lot of folks already struggling to get their arms around working with map/reduce. It would make a good plugin for advanced users.
> Actually the ability to have plugins is almost there now. I have an indexer that only requires some ini file mods and getting the code on the classpath. I think all that's needed at this point is:
>
> 1. conventions for a plugins directory
>
> 2. way of specifying gen_servers in order to supervise them
>
> 3. some APIs around some of the internals.
>
> I'm oversimplifying it for sure, the devil's in the details and it's the kind of thing programmers love to argue about ad nauseam but no one wants to do it (myself included :)
>
> Best,
>
> Bob
>
>
>
> On Sep 19, 2010, at 10:22 AM, Norman Barker wrote:
>
>> Bob,
>>
>> it is just checking that a given id participates in a view, if it
>> makes it around the ring then it wins and gets streamed to the client,
>> adding disjoints would be fairly simple. Currently the only way I can
>> check if an id is in a view is to loop over the results of each view,
>> hence each node in the ring is in its own process to keep things
>> moving.
>>
>> A use case is two views, one that emits datetime (numeric) and another
>> view that emits values, e.g. A, B, C ..., the query would then be to
>> find all the documents with value A between start time and end time.
>>
>> Norman
>>
>> On Sun, Sep 19, 2010 at 5:21 AM, Robert Dionne wrote:
>>> I took another peek at this and I'm curious as to what it's doing. Is it just checking that a given id participates in a view? So if it makes it around the ring it wins? Or is it actually computing the result of passing the doc through all the views?
>>>
>>> If the answer is the former then would disjunction also be something one might want? I'm just curious, I don't have a use case and I forget the original discussion around this. I sort of think of views as a functional mapping from the database to some subset. That's not entirely accurate given there's this reduce phase also.
>>> So I could imagine composing views in a functional way, but the same thing can be had with just a different map function that is the composition.
>>>
>>> Anyway if you have a brief description of this, with a use case, it would help.
>>>
>>> Cheers,
>>>
>>> Bob
>>>
>>>
>>>
>>>
>>> On Sep 17, 2010, at 11:32 PM, Norman Barker wrote:
>>>
>>>> Chris, James
>>>>
>>>> thanks for bumping this, we are using this internally at 'scale'
>>>> (million+ keys). I want this to work for couchdb as we want to give
>>>> back for such a great product and support this going forward, so any
>>>> suggestions welcomed and we will test and add them to the local github
>>>> account with the aim of getting this into trunk.
>>>>
>>>> Norman
>>>>
>>>> On Fri, Sep 17, 2010 at 7:00 PM, James Hayton wrote:
>>>>> I want to use it! I just haven't gotten around to it. I was going to try
>>>>> and test it out this weekend and if I am able, I will certainly report back
>>>>> what I find.
>>>>>
>>>>> James
>>>>>
>>>>> On Fri, Sep 17, 2010 at 5:55 PM, Chris Anderson wrote:
>>>>>
>>>>>> On Mon, Aug 30, 2010 at 10:58 AM, Norman Barker wrote:
>>>>>>> Bob,
>>>>>>>
>>>>>>> I can and have been testing the multiview at this scale, it is ok
>>>>>>> (fast enough), but I think being able to test inclusion of a document
>>>>>>> id in a view without having to loop would be a considerable speed
>>>>>>> improvement. If you have any ideas let me know.
>>>>>>>
>>>>>>
>>>>>> I just want to bump this thread, as I think this is a useful feature.
>>>>>> I don't expect to be able to test it in the coming weeks, but if I did
>>>>>> I would. Is anyone besides Norman using this? Has anyone used it at
>>>>>> scale?
>>>>>>
>>>>>> Cheers,
>>>>>> Chris
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Norman
>>>>>>>
>>>>>>> On Mon, Aug 30, 2010 at 10:49 AM, Robert Newson wrote:
>>>>>>>> I'm sorry, I've had no time to play with this at scale.
>>>>>>>>
>>>>>>>> On Mon, Aug 30, 2010 at 5:35 PM, Norman Barker wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> are there any more comments on this, if not can you describe the
>>>>>>>>> process (in particular how to obtain a wiki and jira account for
>>>>>>>>> couchdb which I have been unable to do) and I will start documenting
>>>>>>>>> this so we can put this into the trunk.
>>>>>>>>>
>>>>>>>>> Bob, were you able to do any more testing with large views, are there
>>>>>>>>> any suggestions on how to speed up the document id inclusion test as
>>>>>>>>> described below?
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Norman
>>>>>>>>>
>>>>>>>>> On Mon, Aug 23, 2010 at 9:22 AM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>> Bob,
>>>>>>>>>>
>>>>>>>>>> thanks for the feedback and for taking a look at the code. Guidelines
>>>>>>>>>> on when to use a supervisor within couchdb with a gen_server would be
>>>>>>>>>> appreciated, currently I have a supervisor and a gen_server, but if
>>>>>>>>>> couchdb has a supervision process I could remove that layer.
>>>>>>>>>>
>>>>>>>>>> I think plugins is a great idea, however intersection of views is such
>>>>>>>>>> a common request, perhaps there needs to be a plugin system and if a
>>>>>>>>>> plugin is rated enough it goes into trunk as a core feature.
>>>>>>>>>>
>>>>>>>>>> the four (or slightly more) line summary is here
>>>>>>>>>>
>>>>>>>>>> http://github.com/normanb/couchdb/raw/trunk/src/couchdb/couch_query_ring.erl
>>>>>>>>>>
>>>>>>>>>> %
>>>>>>>>>> % send an id from the start list to the next node in the ring, if the
>>>>>>>>>> % id is in the adjacent node then this node sends to the next ring node ....
>>>>>>>>>> % if the id gets all round the ring and back to the start node then it
>>>>>>>>>> % has intersected all queries and should be included.
>>>>>>>>>> % The nodes in the ring should be sorted in size from small to large
>>>>>>>>>> % for this to be effective
>>>>>>>>>> %
>>>>>>>>>> % In addition send the initial id list round in parallel
>>>>>>>>>>
>>>>>>>>>> it really needs some eyes from the core couchdb coders to see how to
>>>>>>>>>> speed up the inclusion testing, looping is bad even if it is done in
>>>>>>>>>> parallel.
>>>>>>>>>>
>>>>>>>>>> Multiview is usable, I am using it with some pretty big mega-views (as
>>>>>>>>>> per the raindrop model), I am also available to add features to this
>>>>>>>>>> as this is a core part of our work and we want to give it to couch as a
>>>>>>>>>> contribution.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Norman
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 23, 2010 at 5:05 AM, Robert Dionne wrote:
>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>
>>>>>>>>>>> I took a peek at multiview. I haven't followed this too closely on the mailing list but this is *view intersection*? Is there a 5 line summary of what this does somewhere?
>>>>>>>>>>>
>>>>>>>>>>> I'm curious as to why the daemon needs to be a supervisor, most if not all of the other daemons are gen_servers. OTP allows this but I think this is a good area where some CouchDB guidelines on plugins would apply.
>>>>>>>>>>>
>>>>>>>>>>> It strikes me that views, the use of map/reduce, etc. are one of the trickier aspects of using CouchDB, particularly for new users coming from the SQL world. People are also reporting issues with performance of views, I guess often because reduce functions go out of control.
>>>>>>>>>>>
>>>>>>>>>>> I think the project would be better served if features like this were available as plugins. I would put GeoCouch in the same category.
>>>>>>>>>>> It's very neat and timely (given everyone wants to know where everyone else is using their telephone but without talking other than asynchronously), but a server plugin architecture that would allow this to be done cleanly should come first.
>>>>>>>>>>>
>>>>>>>>>>> This is just my opinion. I'd love to see some of the project founders and committers weigh in on this and set some direction.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> Bob
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Aug 22, 2010, at 5:45 PM, Norman Barker wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I would like to take this multiview code and have it added to trunk if
>>>>>>>>>>>> possible, what are the next steps?
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Norman
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 18, 2010 at 11:44 AM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>>>> I have made
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://github.com/normanb/couchdb
>>>>>>>>>>>>>
>>>>>>>>>>>>> which is a fork of the latest couchdb trunk with the multiview code
>>>>>>>>>>>>> and tests added.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If geocouch is available then it can still be used.
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are a couple of questions about the multiview on the user / dev
>>>>>>>>>>>>> list so I will be adding some more test cases during today.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 9:23 PM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>>>>> this is possible, I forked geocouch since I use it, but I have already
>>>>>>>>>>>>>> separated the geocouch dependencies from the trunk.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can do this tomorrow, certainly be interested in any feedback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 7:49 PM, Volker Mische <volker.mische@gmail.com> wrote:
>>>>>>>>>>>>>>> On 08/18/2010 03:26 AM, J Chris Anderson wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 16, 2010, at 4:38 PM, Norman Barker wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have made the changes as recommended, adding a test case
>>>>>>>>>>>>>>>>> multiview.js and also adding the userCtx to open the db.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have also forked geocouch and this is available here
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> this patch seems important (especially as people are already asking for
>>>>>>>>>>>>>>>> help using it on user@)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> to get it committed, it either must remove the dependency on GeoCouch, or
>>>>>>>>>>>>>>>> become part of CouchDB when (and if) GeoCouch becomes part of CouchDB.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is it possible / useful to make a version that doesn't use GeoCouch? And
>>>>>>>>>>>>>>>> then to make the GeoCouch capabilities part of GeoCouch for now?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> if the patch is ready for trunk, I'd be happy to move the GeoCouch bits to
>>>>>>>>>>>>>>> GeoCouch itself (as GeoCouch isn't ready for trunk yet).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Lately I haven't been that responsive when it comes to GeoCouch, but that
>>>>>>>>>>>>>>> will change (in about a month) after holidays and FOSS4G.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Volker
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Chris Anderson
>>>>>> http://jchrisa.net
>>>>>> http://couch.io
>>>>>>
>>>>>
>>>
>>>
>
>
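
For readers skimming the thread, the ring idea quoted above (send each id from the smallest result list around the other views, keep it only if it makes it all the way round) might be sketched roughly as follows. This is a minimal illustration in Python, not the Erlang code in couch_query_ring.erl. The function name ring_intersection and the sample ids are made up for the example. It also assumes each view's ids fit in an in-memory set, so membership is a hash lookup rather than the loop over view results discussed in the thread, and it runs sequentially rather than with one process per ring node.

```python
from typing import Hashable, Iterable, List


def ring_intersection(views: Iterable[Iterable[Hashable]]) -> List[Hashable]:
    """Return the ids that appear in every view (hypothetical helper)."""
    # De-duplicate each view's ids while keeping their original order.
    id_lists = [list(dict.fromkeys(ids)) for ids in views]
    if not id_lists:
        return []
    # Sort the ring from the smallest view to the largest, as the quoted
    # comment suggests, so the fewest ids are sent round.
    id_lists.sort(key=len)
    start = id_lists[0]
    # Assumption: the remaining views are held as sets for fast membership.
    ring = [set(ids) for ids in id_lists[1:]]
    # all() stops at the first node that lacks the id, which corresponds to
    # the id dropping out of the ring early.
    return [doc_id for doc_id in start if all(doc_id in node for node in ring)]


# Use case from the thread: one view selects by time range, the other by value.
by_time = ["doc1", "doc2", "doc3", "doc5"]  # ids with datetime in range
by_value = ["doc2", "doc5", "doc9"]         # ids that emitted value "A"
print(ring_intersection([by_time, by_value]))  # ['doc2', 'doc5']
```

A disjunction, as raised in the thread, would keep an id as soon as any one node contains it, which is why broadcasting rather than walking the ring could suit that case better.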