incubator-couchdb-user mailing list archives

From "Dan Reverri" <reve...@gmail.com>
Subject Re: the search api?
Date Sun, 27 Jul 2008 23:04:44 GMT
Dean,

Any chance you want to share your view code?

With regard to the query parsing, I am not sure how this will work. Right now
the results for each term have to be pulled down to the client and merged
together. Perhaps we could add a query method to views that allows different
key values to be combined.

A user could query a view with a set of keys and a merge function that
defines how the results for those keys are combined.
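
To make the idea concrete, here is a rough sketch of the client-side merge as
it would have to be done today. It assumes each term has already been looked
up in a word-index view and that the result is a list of document ids per
term; the function name, the "and"/"or" modes and the input shape are all
made up for illustration and are not an existing CouchDB API.

```javascript
// Hypothetical helper: combine per-term result lists on the client.
// `resultsByTerm` maps each query term to the array of document ids
// that a word-index view returned for that key (assumed shape).
function mergeResults(resultsByTerm, mode) {
  const idSets = Object.values(resultsByTerm).map((ids) => new Set(ids));
  if (idSets.length === 0) return [];

  if (mode === "or") {
    // Union: a document matches if any term matched it.
    const union = new Set();
    for (const ids of idSets) {
      for (const id of ids) union.add(id);
    }
    return [...union].sort();
  }

  // Default "and": start from the smallest set and keep only the ids
  // that appear in every other set.
  idSets.sort((a, b) => a.size - b.size);
  const [smallest, ...rest] = idSets;
  return [...smallest]
    .filter((id) => rest.every((ids) => ids.has(id)))
    .sort();
}
```

A server-side version of this would take the same two inputs, the key set and
the merge mode, which is roughly what a merge function on a view query would
have to express.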

On Fri, Jul 25, 2008 at 5:01 PM, Dean Landolt <dean@deanlandolt.com> wrote:

> On Mon, Jul 21, 2008 at 11:45 AM, Dean Landolt <dean@deanlandolt.com>
> wrote:
>
> > On Mon, Jul 21, 2008 at 1:08 AM, Dan Reverri <reverri@gmail.com> wrote:
> >
> >> Is it worthwhile to implement a full text indexer on top of CouchDB's
> >> map/reduce functionality?
> >>
> >> http://wiki.apache.org/couchdb/FullTextIndexWithView
> >>
> >
> >
> > Interesting idea. There's definitely more to FTI than tokenization alone,
> > but then again there's an awful lot of power in m/r and JavaScript -- it
> > didn't take me a second to find a Porter stemming algorithm in JS:
> > http://tartarus.org/~martin/PorterStemmer/js.txt
> >
> > I bet variable weighting would be pretty close to impossible in the m/r
> > paradigm though, and probably some other features (of course, I could be
> > wrong, and when it comes to CouchDB, thus far I usually am). For a
> > straight-up word search, this is serviceable as is. I'm going to see if I
> > can't figure out how to shoehorn in some boolean features.
> >
>
> I gave this approach another look and I was able to get a view together
> that did a little more (stemming, optional case-insensitivity, min length
> for tokens, better whitespace handling). I'm working on an ngram view too
> and so far it's promising. But there's still one huge problem -- for the
> life of me I can't figure out a workable strategy for boolean operations
> that doesn't involve fully loading each piece of the query. Am I missing
> something? Is something like this even possible? I know there's no way to
> load a piece of a view from another view -- but I just can't help but
> really wish there were.
>
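
For anyone following along, the features Dean lists (stemming, optional
case-insensitivity, a minimum token length, whitespace handling) might fit
into a map-style function roughly as below. This is only a guess at the
shape, not his view code: the `body` field, the `OPTIONS` names and the
do-nothing `stem` placeholder are assumptions, and `emit` is passed in as a
parameter so the sketch can run outside CouchDB.

```javascript
// Hypothetical options; none of these names come from Dean's view.
const OPTIONS = { caseSensitive: false, minLength: 3 };

// Placeholder stemmer: returns the token unchanged. A real Porter
// stemmer (such as the one linked in the thread) would go here.
function stem(token) {
  return token;
}

// Split text into tokens on any run of non-alphanumeric characters,
// which also absorbs repeated whitespace and punctuation.
function tokenize(text, options) {
  const source = options.caseSensitive ? text : text.toLowerCase();
  return source
    .split(/[^A-Za-z0-9]+/)
    .filter((token) => token.length >= options.minLength)
    .map(stem);
}

// Map-style function: emits one (token, doc id) row per distinct token.
// Assumes the text to index lives in a string field called `body`.
function mapDoc(doc, emit) {
  if (typeof doc.body !== "string") return;
  const seen = new Set();
  for (const token of tokenize(doc.body, OPTIONS)) {
    if (seen.has(token)) continue;
    seen.add(token);
    emit(token, doc._id);
  }
}
```

Each distinct token becomes a view key, so a single-word search is one key
lookup. Boolean queries still need one lookup per term plus a merge, which is
the problem described above.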
