lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: StandardAnalyzer .. stemming
Date Fri, 17 Feb 2006 19:47:04 GMT

: The SnowBallAnalyzer seems to offer stemming. The StandardAnalyzer on
: the other hand has a bunch of other niceness. What is the best practice
: of leveraging both these analyzers while indexing and searching? Do I
: chain these up somehow and if so what apis do i look at for doing so? Do
: i implement my own analyzer and use both these two process the tokens?

the Analyzer class is already designed to make chaining very easy -- but
not Analyzer chaining, TokenFilter chaining.
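
to make the distinction concrete, here is a minimal sketch of the chaining
idea in plain Java. it deliberately does not use Lucene's classes:
SimpleTokenStream, WhitespaceSource, LowerCaseStage, StopStage, ToyStemStage
and ChainDemo are all hypothetical stand-ins, and the toy suffix stripper is
not a Snowball stemmer. the only point is that each stage wraps the one
before it, which is the same shape a TokenFilter chain has.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a token stream: next() returns null at end of input.
interface SimpleTokenStream {
    String next();
}

// Head of the chain: splits the text on whitespace.
final class WhitespaceSource implements SimpleTokenStream {
    private final Iterator<String> words;

    WhitespaceSource(String text) {
        String trimmed = text.trim();
        List<String> parts = trimmed.isEmpty()
                ? new ArrayList<String>()
                : Arrays.asList(trimmed.split("\\s+"));
        this.words = parts.iterator();
    }

    public String next() {
        return words.hasNext() ? words.next() : null;
    }
}

// Filter stage: lower-cases every token coming from the wrapped stream.
final class LowerCaseStage implements SimpleTokenStream {
    private final SimpleTokenStream input;

    LowerCaseStage(SimpleTokenStream input) { this.input = input; }

    public String next() {
        String t = input.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

// Filter stage: skips tokens that appear in the stop set.
final class StopStage implements SimpleTokenStream {
    private final SimpleTokenStream input;
    private final Set<String> stopWords;

    StopStage(SimpleTokenStream input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }

    public String next() {
        String t;
        while ((t = input.next()) != null) {
            if (!stopWords.contains(t)) return t;
        }
        return null;
    }
}

// Filter stage: toy suffix stripper for illustration only, NOT a Snowball stemmer.
final class ToyStemStage implements SimpleTokenStream {
    private final SimpleTokenStream input;

    ToyStemStage(SimpleTokenStream input) { this.input = input; }

    public String next() {
        String t = input.next();
        if (t == null) return null;
        if (t.endsWith("ing") && t.length() > 5) return t.substring(0, t.length() - 3);
        if (t.endsWith("s") && t.length() > 3) return t.substring(0, t.length() - 1);
        return t;
    }
}

final class ChainDemo {
    // Builds source -> lower-case -> stop -> stem and drains the result.
    static List<String> analyze(String text) {
        Set<String> stop = new HashSet<String>(Arrays.asList("the", "a", "of"));
        SimpleTokenStream chain = new ToyStemStage(
                new StopStage(new LowerCaseStage(new WhitespaceSource(text)), stop));
        List<String> out = new ArrayList<String>();
        for (String t = chain.next(); t != null; t = chain.next()) out.add(t);
        return out;
    }
}
```

with these stand-ins, ChainDemo.analyze("The Indexing of Documents") yields
[index, document]. adding a stage means wrapping the chain in one more
object; nothing about the earlier stages has to change.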

if you take a look at the source for StandardAnalyzer and SnowballAnalyzer
it should (hopefully) be very obvious how to write your own Analyzer (10
lines or fewer) that gives you all the goodness you want from both...

http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardAnalyzer.java?rev=219090&view=markup
http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/snowball/src/java/org/apache/lucene/analysis/snowball/SnowballAnalyzer.java?rev=151459&view=markup

...if you literally just want to add snowball stemming to
the end of StandardAnalyzer, then i *think* something like this would
work...

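   // stoplist: your stop words. the stemmer name below is a placeholder for
   // a Snowball stemmer name such as "English".
   Analyzer a = new StandardAnalyzer(stoplist) {
     public TokenStream tokenStream(String fieldName, Reader reader) {
       // wrap StandardAnalyzer's finished filter chain in the stemming filter
       return new SnowballFilter(super.tokenStream(fieldName,reader),
                                 "yourChoiceOfStemmerName");
     }
   };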


-Hoss



