lucene-solr-user mailing list archives

From Walter Underwood <wunderw...@netflix.com>
Subject Re: token concat filter?
Date Thu, 01 May 2008 17:53:54 GMT
I've been doing it with synonyms and I have several hundred of them.

Concatenating bi-word groups is pretty useful for English. We have a
habit of gluing words together. "database" used to be two words.
Dictionaries still think it should be "web server".
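
A minimal sketch of the bi-word concatenation idea, in plain Python rather than as a Lucene/Solr token filter. The function name `concat_biwords` and the choice to keep the original tokens alongside the glued pairs are assumptions for illustration, not something specified in this thread.

```python
def concat_biwords(tokens):
    """Return the tokens, with each adjacent pair also emitted glued together.

    Hypothetical helper: the original tokens are kept, and after each token
    (except the last) the concatenation of it and its successor is added,
    so "web server" can also match "webserver".
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            # Glue this token to the next one with no separator.
            out.append(tok + tokens[i + 1])
    return out


if __name__ == "__main__":
    print(concat_biwords(["web", "server", "log"]))
```

With the synonym approach, the same pair would instead be listed by hand, for example as a line like `web server, webserver` in synonyms.txt.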

wunder

On 5/1/08 10:47 AM, "Geoffrey Young" <geoff@modperlcookbook.org> wrote:
> 
> Yonik Seeley wrote:
>> If there are only a few such cases, it might be better to use synonyms
>> to correct them.
> 
> unfortunately, there are too many to handle this way.
> 
>> Off the top of my head there's no concatenating token filter, but it
>> wouldn't be hard to make one.
> 
> hmm, ok.  I'm not a java guy, so I'll try the PatternTokenizerFactory
> before trying to write my own.  thanks :)
> 
> speaking of synonyms... will changes to synonyms.txt (and the other
> files) take effect on each re-indexing, or does the Solr server read it
> once on load then hold on to it until restart?
> 
> --Geoff

