couchdb-dev mailing list archives

From Norman Barker <norman.bar...@gmail.com>
Subject Re: multiview on github
Date Tue, 21 Sep 2010 02:49:51 GMT
Bob,

thanks, that is interesting. I will check out your code and see if I
can get it working. I wrote couchdb-clucene and am interested in a
lightweight text search for couchdb. I also liked your work with
ontylog, but I can't mix GPL with anything I am doing.

Norman

On Mon, Sep 20, 2010 at 7:22 PM, Robert Dionne
<dionne@dionne-associates.com> wrote:
> Norman,
>
>  Actually ontylog is GPL, and I wouldn't wish that code on anyone just yet. Think of it as the contents of my /etc directory.
>
>  The indexer I'm chipping away at is just a proof of concept hacked up from Joe Armstrong's Erlang book (with his permission). Anyone is welcome to use it as they see fit, though it does have restrictions from Armstrong press. It's been great for me to learn erlang and explore the couch internals. It's also nice to have something nice and light running in couch.
>
>  My thoughts about plugins have nothing to do with licenses. I like the fact that couchdb is simple and lean, and would like it to be more rock solid. I'm not sure multiview, geocouch, fti, or any other indexers belong in the core. With multiview I think there's perhaps something more general that might be part of core but I haven't given it a lot of thought yet.
>
> Cheers,
>
> Bob
>
>
>
>
> On Sep 20, 2010, at 7:02 PM, Norman Barker wrote:
>
>> Bob,
>>
>> I can see why plugins might work for you since your ontology /
>> indexing code is GPL, however I am more than happy for the multiview
>> to be apache licensed and would like to see it in trunk.
>>
>> I like the concept of plugins as it creates a stable API for third
>> parties, but I think a multiview is a core feature of CouchDB.
>>
>> Norman
>>
>> On Mon, Sep 20, 2010 at 4:19 AM, Robert Dionne
>> <dionne@dionne-associates.com> wrote:
>>> I see, neat.
>>>
>>> I ask because you might treat disjunction and conjunction differently in terms of whether you run around the ring or broadcast to all the nodes. For conjunctions you need all to succeed so broadcast might fare better whereas for disjunctions only one need succeed. I suppose it would depend largely on the number of views and the amount of each computation.
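
The difference between the two cases can be illustrated with a small Python sketch. It is not taken from the multiview code: `CountingView`, `in_all` and `in_any` are hypothetical names, and each view is modelled as an in-memory set of document ids so that lookups can be counted.

```python
class CountingView:
    """Hypothetical stand-in for a view: a set of doc ids that counts lookups."""

    def __init__(self, ids):
        self.ids = set(ids)
        self.lookups = 0

    def __contains__(self, doc_id):
        self.lookups += 1
        return doc_id in self.ids


def in_all(doc_id, views):
    """Conjunction: the id must appear in every view; stops at the first miss."""
    return all(doc_id in view for view in views)


def in_any(doc_id, views):
    """Disjunction: one hit is enough, so later views are never consulted."""
    return any(doc_id in view for view in views)


views = [CountingView(["a", "b"]), CountingView(["b", "c"]), CountingView(["b"])]
both = in_all("b", views)      # has to visit all three views
either = in_any("a", views)    # stops after the first view
lookups = [v.lookups for v in views]
print(both, either, lookups)
```

A successful conjunction always touches every view, which is why asking all of them at once may pay off, while a disjunction can finish after a single hit.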
>>>
>>> Anyway I guess I have mixed feelings about seeing this in core. I see a lot of folks already struggling to get their arms around working with map/reduce. It would make a good plugin for advanced users. Actually the ability to have plugins is almost there now. I have an indexer that only requires some ini file mods and getting the code on the classpath. I think all that's needed at this point is:
>>>
>>> 1. conventions for a plugins directory
>>>
>>> 2. a way of specifying gen_servers so that they can be supervised
>>>
>>> 3. some apis around some of the internals.
>>>
>>> I'm oversimplifying it for sure, the devil's in the details and it's the kind of thing programmers love to argue about ad nauseam but no one wants to do it (myself included :)
>>>
>>> Best,
>>>
>>> Bob
>>>
>>>
>>>
>>> On Sep 19, 2010, at 10:22 AM, Norman Barker wrote:
>>>
>>>> Bob,
>>>>
>>>> it is just checking that a given id participates in a view, if it
>>>> makes it around the ring then it wins and gets streamed to the client,
>>>> adding disjoints would be fairly simple. Currently the only way I can
>>>> check if an id is in a view is to loop over the results of each view,
>>>> hence each node in the ring is in its own process to keep things
>>>> moving.
>>>>
>>>> A use case is two views, one that emits datetime (numeric) and another
>>>> view that emits values, e.g. A, B, C ..., the query would then be to
>>>> find all the documents with value A between start time and end time.
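
As a rough illustration of this use case, here is a Python sketch with made-up data. The row layout (key, document id), the sample rows and the helper names are assumptions for the example and are not the multiview API.

```python
from datetime import datetime

# Hypothetical view rows as (key, doc_id) pairs.
time_view = [
    (datetime(2010, 9, 1, 8, 0), "doc1"),
    (datetime(2010, 9, 5, 12, 30), "doc2"),
    (datetime(2010, 9, 20, 9, 15), "doc3"),
]
value_view = [("A", "doc1"), ("B", "doc2"), ("A", "doc3"), ("A", "doc4")]


def ids_in_range(rows, start, end):
    """Doc ids whose key lies between start and end (inclusive)."""
    return {doc_id for key, doc_id in rows if start <= key <= end}


def ids_with_key(rows, wanted):
    """Doc ids whose key equals the wanted value."""
    return {doc_id for key, doc_id in rows if key == wanted}


def value_between(value, start, end):
    """Intersect the two view results: value matches AND time is in range."""
    return sorted(ids_with_key(value_view, value) & ids_in_range(time_view, start, end))


matches = value_between("A", datetime(2010, 9, 1), datetime(2010, 9, 10))
print(matches)
```

Only documents that appear in both result sets survive, which is the intersection the query asks for.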
>>>>
>>>> Norman
>>>>
>>>> On Sun, Sep 19, 2010 at 5:21 AM, Robert Dionne
>>>> <dionne@dionne-associates.com> wrote:
>>>>> I took another peek at this and I'm curious as to what it's doing. Is it just checking that a given id participates in a view? So if it makes it around the ring it wins? Or is it actually computing the result of passing the doc thru all the views?
>>>>>
>>>>> If the answer is the former then would disjunction also be something one might want? I'm just curious, I don't have a use case and I forget the original discussion around this. I sort of think of views as a functional mapping from the database to some subset. That's not entirely accurate given there's this reduce phase also. So I could imagine composing views in a functional way, but the same thing can be had with just a different map function that is the composition.
>>>>>
>>>>> Anyway if you have a brief description of this, with a use case, it would help.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Bob
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sep 17, 2010, at 11:32 PM, Norman Barker wrote:
>>>>>
>>>>>> Chris, James
>>>>>>
>>>>>> thanks for bumping this, we are using this internally at 'scale'
>>>>>> (million+ keys). I want this to work for couchdb as we want to give
>>>>>> back for such a great product and support this going forward, so any
>>>>>> suggestions welcomed and we will test and add them to the local github
>>>>>> account with the aim of getting this into trunk.
>>>>>>
>>>>>> Norman
>>>>>>
>>>>>> On Fri, Sep 17, 2010 at 7:00 PM, James Hayton <theboss@purplebulldog.com> wrote:
>>>>>>> I want to use it!  I just haven't gotten around to it.  I was going to try
>>>>>>> and test it out this weekend and if I am able, I will certainly report back
>>>>>>> what I find.
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>> On Fri, Sep 17, 2010 at 5:55 PM, Chris Anderson <jchris@apache.org> wrote:
>>>>>>>
>>>>>>>> On Mon, Aug 30, 2010 at 10:58 AM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>> Bob,
>>>>>>>>>
>>>>>>>>> I can and have been testing the multiview at this scale, it is ok
>>>>>>>>> (fast enough), but I think being able to test inclusion of a document
>>>>>>>>> id in a view without having to loop would be a considerable speed
>>>>>>>>> improvement. If you have any ideas let me know.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I just want to bump this thread, as I think this is a useful feature.
>>>>>>>> I don't expect to be able to test it in the coming weeks, but if I did
>>>>>>>> I would. Is anyone besides Norman using this? Has anyone used it at
>>>>>>>> scale?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Norman
>>>>>>>>>
>>>>>>>>> On Mon, Aug 30, 2010 at 10:49 AM, Robert Newson <robert.newson@gmail.com> wrote:
>>>>>>>>>> I'm sorry, I've had no time to play with this at scale.
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 30, 2010 at 5:35 PM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> are there any more comments on this, if not can you describe the
>>>>>>>>>>> process (in particular how to obtain a wiki and jira account for
>>>>>>>>>>> couchdb which I have been unable to do) and I will start documenting
>>>>>>>>>>> this so we can put this into the trunk.
>>>>>>>>>>>
>>>>>>>>>>> Bob, were you able to do any more testing with large views, are there
>>>>>>>>>>> any suggestions on how to speed up the document id inclusion test as
>>>>>>>>>>> described below?
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Norman
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 23, 2010 at 9:22 AM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>>> Bob,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for the feedback and for taking a look at the code. Guidelines
>>>>>>>>>>>> on when to use a supervisor within couchdb with a gen_server would be
>>>>>>>>>>>> appreciated, currently I have a supervisor and a gen_server, but if
>>>>>>>>>>>> couchdb has a supervision process I could remove that layer.
>>>>>>>>>>>>
>>>>>>>>>>>> I think plugins is a great idea, however intersection of views is such
>>>>>>>>>>>> a common request, perhaps there needs to be a plugin system and if a
>>>>>>>>>>>> plugin is rated enough it goes into trunk as a core feature.
>>>>>>>>>>>>
>>>>>>>>>>>> the four (or slightly more) line summary is here
>>>>>>>>>>>>
>>>>>>>>>>>> http://github.com/normanb/couchdb/raw/trunk/src/couchdb/couch_query_ring.erl
>>>>>>>>>>>>
>>>>>>>>>>>> %
>>>>>>>>>>>> % send an id from the start list to the next node in the ring, if the
>>>>>>>>>>>> % id is in the adjacent node then this node sends it to the next ring node
>>>>>>>>>>>> % ....
>>>>>>>>>>>> % if the id gets all round the ring and back to the start node then it
>>>>>>>>>>>> % has intersected all queries and should be included. The nodes in the
>>>>>>>>>>>> % ring should be sorted in size from small to large for this to be
>>>>>>>>>>>> % effective
>>>>>>>>>>>> %
>>>>>>>>>>>> % In addition send the initial id list round in parallel
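
The ring idea in the comment above can be sketched sequentially in Python. This is not the Erlang code from couch_query_ring.erl: `ring_intersection` is a hypothetical name, and each ring node is assumed to be an in-memory set of document ids, which gives constant-time membership tests instead of the per-view loop described in this thread. The parallel sending of ids is left out.

```python
def ring_intersection(views):
    """Return the ids present in every view, visiting smaller views first."""
    if not views:
        return []
    # Sort the ring from small to large so most ids are rejected early.
    ring = sorted(views, key=len)
    start, rest = ring[0], ring[1:]
    survivors = []
    for doc_id in start:
        # Pass the id from node to node; all() stops at the first node
        # that does not contain it, so the id never completes the ring.
        if all(doc_id in node for node in rest):
            survivors.append(doc_id)
    return survivors


by_time = {"doc1", "doc2", "doc3", "doc4"}
by_value = {"doc2", "doc4", "doc9"}
by_tag = {"doc4", "doc2", "doc7", "doc8", "doc1"}

result = sorted(ring_intersection([by_time, by_value, by_tag]))
print(result)
```

Starting from the smallest view bounds the number of ids that ever enter the ring, which is the reason the comment asks for small-to-large ordering.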
>>>>>>>>>>>>
>>>>>>>>>>>> it really needs some eyes from the core couchdb coders to see how to
>>>>>>>>>>>> speed up the inclusion testing, looping is bad even if it is done in
>>>>>>>>>>>> parallel.
>>>>>>>>>>>>
>>>>>>>>>>>> Multiview is usable, I am using it with some pretty big mega-views (as
>>>>>>>>>>>> per the raindrop model), I am also available to add features to this
>>>>>>>>>>>> as this is a core part of our work and we want to give it to couch as a
>>>>>>>>>>>> contribution.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Norman
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 23, 2010 at 5:05 AM, Robert Dionne
>>>>>>>>>>>> <dionne@dionne-associates.com> wrote:
>>>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>>>
>>>>>>>>>>>>>  I took a peek at multiview. I haven't followed this too closely on the mailing list but this is *view intersection*? Is there a 5 line summary of what this does somewhere?
>>>>>>>>>>>>>
>>>>>>>>>>>>>  I'm curious as to why the daemon needs to be a supervisor, most if not all of the other daemons are gen_servers. OTP allows this but I think this is a good area where some CouchDB guidelines on plugins would apply.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  It strikes me that views, the use of map/reduce, etc. are one of the trickier aspects of using CouchDB, particularly for new users coming from the SQL world. People are also reporting issues with performance of views, I guess often because reduce functions go out of control.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  I think the project would be better served if features like this were available as plugins. I would put GeoCouch in the same category. It's very neat and timely (given everyone wants to know where everyone else is using their telephone but without talking other than asynchronously), but a server plugin architecture that would allow this to be done cleanly should come first.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  This is just my opinion. I'd love to see some of the project founders and committers weigh in on this and set some direction.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bob
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Aug 22, 2010, at 5:45 PM, Norman Barker wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to take this multiview code and have it added to trunk if
>>>>>>>>>>>>>> possible, what are the next steps?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 18, 2010 at 11:44 AM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>>>>>> I have made
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://github.com/normanb/couchdb
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which is a fork of the latest couchdb trunk with the multiview code
>>>>>>>>>>>>>>> and tests added.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If geocouch is available then it can still be used.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are a couple of questions about the multiview on the user / dev
>>>>>>>>>>>>>>> list so I will be adding some more test cases during today.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 9:23 PM, Norman Barker <norman.barker@gmail.com> wrote:
>>>>>>>>>>>>>>>> this is possible, I forked geocouch since I use it, but I have already
>>>>>>>>>>>>>>>> separated the geocouch dependencies from the trunk.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can do this tomorrow, certainly be interested in any feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 7:49 PM, Volker Mische <volker.mische@gmail.com> wrote:
>>>>>>>>>>>>>>>>> On 08/18/2010 03:26 AM, J Chris Anderson wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Aug 16, 2010, at 4:38 PM, Norman Barker wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have made the changes as recommended, adding a test case
>>>>>>>>>>>>>>>>>>> multiview.js and also adding the userCtx to open the db.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have also forked geocouch and this is available here
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> this patch seems important (especially as people are already asking for
>>>>>>>>>>>>>>>>>> help using it on user@)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> to get it committed, it either must remove the dependency on GeoCouch, or
>>>>>>>>>>>>>>>>>> become part of CouchDB when (and if) GeoCouch becomes part of CouchDB.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is it possible / useful to make a version that doesn't use GeoCouch? And
>>>>>>>>>>>>>>>>>> then to make the GeoCouch capabilities part of GeoCouch for now?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> if the patch is ready for trunk, I'd be happy to move the GeoCouch bits to
>>>>>>>>>>>>>>>>> GeoCouch itself (as GeoCouch isn't ready for trunk yet).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Lately I haven't been that responsive when it comes to GeoCouch, but that
>>>>>>>>>>>>>>>>> will change (in about a month) after holidays and FOSS4G.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>  Volker
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Chris Anderson
>>>>>>>> http://jchrisa.net
>>>>>>>> http://couch.io
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
