lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: read more tokens during analysis
Date Fri, 12 Feb 2010 17:08:52 GMT

> I want to consider the current word and the next as a single term.
>
> When analyzing "Arun Kumar", I want my analyzer to consider
> "Arun" and "Arun Kumar" as synonyms.
>
> In the tokenStream method, how do we read the next token, "Kumar"?
> I am going through the setPositionIncrements method for considering
> them as synonyms, but I don't understand how to implement look-ahead
> in the analyzer.

Do I understand correctly that you want to implement a synonym filter that takes a list of custom synonyms?
If so, why not use Solr's SynonymFilterFactory [1], which does this automatically? It can handle
multi-word synonyms such as "Arun" and "Arun Kumar".
I can share the code to integrate it into Lucene if you want.
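
If you would rather write the look-ahead yourself, here is a rough sketch of the idea in plain Java. It is not Lucene code: the Tok class and the withNextWord method are hypothetical names made up for illustration, and the input is assumed to be already split into words. Each word is emitted with a position increment of 1, and the two-word term is emitted right after it with an increment of 0, so both sit at the same position the way synonyms do. In a real TokenFilter you would probably get the same effect by buffering the next token from the wrapped stream (captureState/restoreState) and setting the increment through PositionIncrementAttribute, but I have not tested that here.

```java
import java.util.ArrayList;
import java.util.List;

final class LookaheadSketch {

    /** Hypothetical stand-in for a token: term text plus position increment. */
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /**
     * For every word, emit the word itself (increment 1) and, when a next
     * word exists, the pair "word next" at the same position (increment 0).
     */
    static List<Tok> withNextWord(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String current = words.get(i);
            out.add(new Tok(current, 1));
            // Look ahead by one word; the last word has nothing to pair with.
            if (i + 1 < words.size()) {
                out.add(new Tok(current + " " + words.get(i + 1), 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: Arun (+1), Arun Kumar (+0), Kumar (+1), one per line.
        for (Tok t : withNextWord(List.of("Arun", "Kumar"))) {
            System.out.println(t.term + " (+" + t.posInc + ")");
        }
    }
}
```

With this layout a query for either "Arun" or "Arun Kumar" would match at the first position. The trailing "Kumar" is still indexed as its own term; drop it in the loop if you do not want that.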

[1] http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
