couchdb-user mailing list archives

From Markus Jelsma <mar...@buyways.nl>
Subject Re: couchdb-lucene : stemming analyzer configuration question
Date Wed, 17 Feb 2010 15:29:07 GMT
Hi,


The documentation leaves it quite unclear which tokenizer each analyzer
uses. The README states that the StandardAnalyzer uses only the
LowerCaseFilterFactory and the StopFilterFactory, but no tokenizer is
mentioned.

I assume that it uses a simple WhitespaceTokenizer, which does not produce
grams (substrings). You can either query using a wildcard, which is supported
AFAIK, or try to specify your own tokenizer, perhaps by creating a custom
analyzer.
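
To illustrate why 'q=a' finds nothing while a wildcard can, here is a rough
sketch in plain Python. It is not couchdb-lucene or Lucene code: the
whitespace-plus-lowercase tokenization follows my assumption above, and the
function names and the sample documents are made up for the example.

```python
# Illustrative only: a plain-Python stand-in for a term index.
# Assumes tokens are whole words (split on whitespace, lowercased).

def tokenize(text):
    """Lowercase the text and split it on whitespace into whole-word tokens."""
    return text.lower().split()

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_query(index, term):
    """Exact term lookup, comparable to q=alex."""
    return set(index.get(term.lower(), set()))

def prefix_query(index, prefix):
    """Prefix lookup, comparable to a trailing-wildcard query such as q=a*."""
    prefix = prefix.lower()
    hits = set()
    for token, doc_ids in index.items():
        if token.startswith(prefix):
            hits |= doc_ids
    return hits

# Hypothetical sample data: the 'name' field values of two documents.
docs = {"doc1": "alex", "doc2": "Bob"}
index = build_index(docs)

print(term_query(index, "alex"))   # whole word: matches doc1
print(term_query(index, "a"))      # partial word as exact term: no match
print(prefix_query(index, "a"))    # prefix/wildcard style: matches doc1
```

The index only contains the whole token 'alex', so an exact lookup for 'a'
has nothing to match against. A prefix query scans the indexed terms instead.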

Either way, searching for grams can be done using an NGramTokenizer.
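
As a rough idea of what n-gram tokenization produces, here is a small
plain-Python sketch. It does not call Lucene's NGramTokenizer; the function
name and the min_gram/max_gram parameters are hypothetical, and the order in
which a real tokenizer emits grams may differ.

```python
# Illustrative only: generates character n-grams the way an n-gram
# tokenizer conceptually does. Not the Lucene implementation.

def ngrams(text, min_gram=1, max_gram=2):
    """Return all substrings of text with length min_gram..max_gram."""
    text = text.lower()
    grams = []
    for size in range(min_gram, max_gram + 1):
        # Slide a window of the current size across the text.
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams

grams = ngrams("alex")
print(grams)
print("a" in grams)
```

With grams such as 'a' and 'al' indexed as terms, a query like 'q=a' can
match the document directly, at the cost of a larger index.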


Cheers,


>Hello,
>I am trying to support fulltext search with CouchDB-Lucene.
>
>I am using CouchDB 0.10.0 and couchdb-lucene 0.4 on Windows XP.
>
>I am able to query a whole word, but not able to match a partial word. For
> example, I have a 'name' field with a value 'alex'. I can query the
> documents if I use 'q=alex'. But I am not able to get any documents if I
> use 'q=a'.
>
>I suspect that this is because the default StandardAnalyzer does not support
> this. How should I configure a different analyzer to support this?
>
>rgds,
>canal

Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

