incubator-couchdb-user mailing list archives

From "Chris Anderson" <jch...@gmail.com>
Subject Re: view intersections? (and Full Text Indexing)
Date Sun, 11 Jan 2009 23:17:13 GMT
On Sun, Jan 11, 2009 at 2:49 PM, Jeff Hinrichs - DM&T
<dundeemt@gmail.com> wrote:
> On Sun, Jan 11, 2009 at 2:17 PM, Dean Landolt <dean@deanlandolt.com> wrote:
>> On Sun, Jan 11, 2009 at 2:11 PM, Jeff Hinrichs - DM&T <jeffh@dundeemt.com> wrote:
>>
>>> I've been reading and googling trying to figure out the proper way to
>>> do an intersection of views.
>>>
>>> The database has documents with an attribute of tags (a list)
>>> ['copper','blue','hot','long','twisted']
>>>
>>>
>>> If I wanted to find all documents that had the tags of 'copper' and
>>> 'blue' what is the preferred way?  I could index all the elements of
>>> the tags list and then perform two requests key='copper' and
>>> key='blue' and have the client do the intersection.  Is there a way to
>>> have couchdb do the lifting on this one?
>>>
>>> Along the same lines, what about the union of tags: 'copper' or 'blue'?
>>>
>>
>> Two requests, then merge on the client. It's not really a pattern that fits
>> the map/reduce paradigm well. I don't know the status of the fti
>> integration, but once that goes down there should be more efficient ways to
>> handle this.
>>
> True enough that it doesn't fit the map/reduce paradigm, but the
> intersection would be performed post map/reduce. ?? Just like
> key/startkey/endkey/limit are not part of the map/reduce picture, they
> appear to be implemented separately to operate on the results of the
> view created by map/reduce.   This would be an enhancement to querying
> the view, not in generating the view itself.
>
> Feel free to slap me down, as I'm speaking as someone who has not
> looked at the source, is fairly new to couch, and has only limited
> experience (I've got thick skin and a desire to learn :).
> However, set intersections and, to a lesser degree, joins are a
> common and useful idiom, especially when working with sets.  And views
> are sets; couchdb already supports limited set operations by giving
> simple sub-select operations.  The reason, I'm guessing, is that this
> is the natural idiom and it frees the client from doing the work.
> Intersections are a natural progression of view querying, in my
> opinion.
>
> trying-to-get-someone-else-to-do-my-work'ly,
> -Jeff
>


Jeff,

There've been a few discussions about view intersections on the user
and dev lists over the last few months. I think the current consensus
is that it's part of the Full Text Indexing requirements: a solvable
problem, but pretty hairy.

There are a few github branches and suchlike with fulltext support.
One school of thought is to integrate with a tried and true indexer
like Lucene, another is to build an Erlang FTI, perhaps using Couch's
storage model. I think both options will require the same view
intersection capabilities. Once they are available for FTI, it should
be a small step to define an interface for POSTing query intersection
jobs to Couch.

It's also not hard to prototype something like this using the External
module, which lets you write it in the language of your choice.
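
For reference, the merge step from the "two requests, then merge on the
client" approach mentioned earlier in the thread is small, wherever it
ends up running. Below is a minimal Python sketch, not anything that
ships with Couch. It assumes a hypothetical view that emits one row per
tag, so that querying it with key='copper' returns one row for each
document carrying that tag. The sample rows are made up and only mimic
the id/key/value shape of the "rows" array in a view response; fetching
them over HTTP is left out.

```python
from typing import Iterable, Mapping, Set


def doc_ids(rows: Iterable[Mapping]) -> Set[str]:
    """Collect the document ids from the rows of one view response."""
    return {row["id"] for row in rows}


def intersect(*row_sets: Iterable[Mapping]) -> Set[str]:
    """Ids of documents present in every result set ('copper' AND 'blue')."""
    id_sets = [doc_ids(rows) for rows in row_sets]
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


def union(*row_sets: Iterable[Mapping]) -> Set[str]:
    """Ids of documents present in any result set ('copper' OR 'blue')."""
    result: Set[str] = set()
    for rows in row_sets:
        result |= doc_ids(rows)
    return result


# Hypothetical sample data: the rows of two separate requests against an
# assumed per-tag view, one with key='copper' and one with key='blue'.
copper_rows = [
    {"id": "doc1", "key": "copper", "value": None},
    {"id": "doc2", "key": "copper", "value": None},
    {"id": "doc3", "key": "copper", "value": None},
]
blue_rows = [
    {"id": "doc2", "key": "blue", "value": None},
    {"id": "doc3", "key": "blue", "value": None},
    {"id": "doc4", "key": "blue", "value": None},
]

both = intersect(copper_rows, blue_rows)    # tagged 'copper' and 'blue'
either = union(copper_rows, blue_rows)      # tagged 'copper' or 'blue'
```

Working on document ids rather than whole rows keeps the merge cheap;
the matching documents can be fetched afterwards if their bodies are
needed.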

-- 
Chris Anderson
http://jchris.mfdz.com
