lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: SnowballAnalyzer question
Date Fri, 22 Aug 2008 17:30:53 GMT


: I am using the SnowballAnalyzer because of its multi-language stemming
: capabilities - and am very happy with that.
: There is one small glitch which I'm hoping to overcome - can I get it to split
: up internet domain names in the same way that StopAnalyzer does?

Most of the Lucene Analyzers that exist are simple wrappers around 
Tokenizers and TokenFilters, and this is true for SnowballAnalyzer and 
StopAnalyzer as well. All those classes do is some initialization work, 
after which they delegate to various Tokenizers and TokenFilters. If you 
poke around in the code for SnowballAnalyzer you'll see that you can 
write your own analyzer that uses SnowballFilter along with whatever 
tokenizer you want. (If you like StopAnalyzer's tokenization, that would 
be LowerCaseTokenizer.)
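
As a rough sketch of what such an analyzer might look like: the class below 
chains LowerCaseTokenizer into SnowballFilter. It assumes the Lucene 2.x 
Analyzer API that was current when this message was written 
(tokenStream(String, Reader)); later Lucene releases changed how analyzers 
are built, so treat it as an illustration rather than drop-in code. The 
class name LetterSnowballAnalyzer is made up for this example, and the 
stemmer name passed to the constructor (e.g. "English") is whatever you 
would otherwise have given to SnowballAnalyzer.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseTokenizer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.snowball.SnowballFilter;

/**
 * Hypothetical analyzer (name invented for this sketch): tokenizes the
 * way StopAnalyzer does, then applies Snowball stemming.
 * Assumes the Lucene 2.x Analyzer API.
 */
public class LetterSnowballAnalyzer extends Analyzer {

    // Name of the Snowball stemmer to use, e.g. "English" or "German".
    private final String stemmerName;

    public LetterSnowballAnalyzer(String stemmerName) {
        this.stemmerName = stemmerName;
    }

    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Split at every non-letter character and lowercase each token,
        // so the parts of a domain name become separate tokens.
        TokenStream result = new LowerCaseTokenizer(reader);
        // Stem each token with the chosen Snowball stemmer.
        result = new SnowballFilter(result, stemmerName);
        return result;
    }
}
```

The same analyzer has to be used at both index time and query time so 
that the terms line up. If you also want StopAnalyzer's stop word removal, 
a StopFilter could be placed in the chain between the tokenizer and the 
SnowballFilter.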




-Hoss

