lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: pieces missing in reusable analyzers?
Date Mon, 10 Aug 2009 22:36:15 GMT
> Then how do you notify the other filters that they should reset their state?
> TokenStream.reset()?  The javadoc specifies that it's actually used
> for something else - but perhaps it can be reused for this purpose?

Yonik, I did exactly this with several in Lucene contrib.
For these I had to explicitly reset the filtered stream and implement
reset(), or they would not do the right thing.

For example, ThaiWordFilter inside ThaiAnalyzer:

      // first use: build the filter chain and cache it
      streams.source = new StandardTokenizer(reader);
      streams.result = new StandardFilter(streams.source);
      streams.result = new ThaiWordFilter(streams.result);
      streams.result = new StopFilter(streams.result,
          StopAnalyzer.ENGLISH_STOP_WORDS_SET);
      setPreviousTokenStream(streams);
} else {
      // reuse: point the tokenizer at the new reader
      streams.source.reset(reader);
      streams.result.reset(); // reset the ThaiWordFilter's state
}
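
To make the failure mode concrete, here is a rough, self-contained sketch of why the extra reset() call matters. It does not use the Lucene classes: ToyStream, ToySource, SplittingFilter and ResetDemo are hypothetical stand-ins invented for this illustration, and it assumes the consumer stopped partway through the previous document so that the filter still holds a buffered token.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT the Lucene API.
interface ToyStream {
    String next();   // returns null when the stream is exhausted
    void reset();    // discard any per-document state
}

// Source stage: can be re-pointed at new input, loosely like reset(reader).
class ToySource implements ToyStream {
    private Iterator<String> it = Collections.emptyIterator();
    void reset(List<String> tokens) { it = tokens.iterator(); }
    public String next() { return it.hasNext() ? it.next() : null; }
    public void reset() { /* no per-document state of its own */ }
}

// Stateful filter: splits "a+b" into "a" and "b", buffering the pieces.
class SplittingFilter implements ToyStream {
    private final ToyStream input;
    private final Deque<String> pending = new ArrayDeque<>();

    SplittingFilter(ToyStream input) { this.input = input; }

    public String next() {
        if (pending.isEmpty()) {
            String tok = input.next();
            if (tok == null) return null;
            pending.addAll(Arrays.asList(tok.split("\\+")));
        }
        return pending.poll();
    }

    // Drop buffered pieces, then pass the reset down the chain.
    public void reset() {
        pending.clear();
        input.reset();
    }
}

final class ResetDemo {
    // Returns the tokens seen for the second document on a reused chain.
    static List<String> secondDoc(boolean callReset) {
        ToySource source = new ToySource();
        ToyStream result = new SplittingFilter(source);

        source.reset(Arrays.asList("a+b"));
        result.next();                      // consume only "a"; "b" stays buffered

        source.reset(Arrays.asList("c"));   // reuse: new input for the source
        if (callReset) result.reset();      // reuse: clear the filter's state

        List<String> out = new ArrayList<>();
        for (String t = result.next(); t != null; t = result.next()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(secondDoc(false)); // [b, c]  stale token leaks
        System.out.println(secondDoc(true));  // [c]
    }
}
```

Resetting only the source is not enough in this toy chain: the leftover "b" from the first document shows up in the second unless the filter's own reset() is called, which mirrors the two calls in the else branch above.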

-- 
Robert Muir
rcmuir@gmail.com


