lucene-dev mailing list archives

From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: pieces missing in reusable analyzers?
Date Mon, 10 Aug 2009 22:30:08 GMT
On Mon, Aug 10, 2009 at 6:21 PM, Earwin Burrfoot <earwin@gmail.com> wrote:
> I'm just keeping a reference to Tokenizer, so I can reset it with a
> new reader. Though this situation is awkward, TS definitely does not
> need a reset(Reader).
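
(For concreteness, that pattern presumably looks something like the sketch
below. It is written against the 2.9-era Analyzer hooks; the SavedStreams
holder, the class name, and the whitespace/lowercase chain are just
placeholders, not actual code from anywhere.)

import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.WhitespaceTokenizer;

public class ReusableExampleAnalyzer extends Analyzer {

  // Keep the Tokenizer so it can be pointed at the next Reader, plus the
  // top of the filter chain that consumers actually pull tokens from.
  private static final class SavedStreams {
    Tokenizer source;
    TokenStream result;
  }

  public TokenStream tokenStream(String fieldName, Reader reader) {
    // Non-reusable path: build a fresh chain every time.
    return new LowerCaseFilter(new WhitespaceTokenizer(reader));
  }

  public TokenStream reusableTokenStream(String fieldName, Reader reader)
      throws IOException {
    SavedStreams streams = (SavedStreams) getPreviousTokenStream();
    if (streams == null) {
      streams = new SavedStreams();
      streams.source = new WhitespaceTokenizer(reader);
      streams.result = new LowerCaseFilter(streams.source);
      setPreviousTokenStream(streams);
    } else {
      // Re-point the kept Tokenizer at the new Reader instead of
      // rebuilding the chain.
      streams.source.reset(reader);
    }
    return streams.result;
  }
}

The reuse branch re-points the kept Tokenizer at the new Reader, but nothing
there tells the rest of the chain that a new document is starting.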

Then how do you notify the other filters that they should reset their state?
TokenStream.reset()?  The javadoc specifies that it's actually used
for something else - but perhaps it can be reused for this purpose?

I specifically used NGramTokenFilter in my example because it did use
internal state (and it's a bug that it has no way to reset that state
currently).
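
To make that concrete, here is roughly what such a filter would need.
BufferingFilter is a made-up stand-in for NGramTokenFilter, and it assumes
TokenFilter.reset() forwards the call to the wrapped stream:

import java.io.IOException;
import java.util.LinkedList;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.util.AttributeSource;

// Hypothetical stand-in for a filter like NGramTokenFilter: it buffers
// attribute states across incrementToken() calls, so anything left over
// from the previous document leaks unless reset() clears it.
public final class BufferingFilter extends TokenFilter {

  private final LinkedList<AttributeSource.State> buffer =
      new LinkedList<AttributeSource.State>();
  private boolean consumed = false;

  public BufferingFilter(TokenStream input) {
    super(input);
  }

  public boolean incrementToken() throws IOException {
    if (!consumed) {
      // Pull everything from upstream once and remember it.
      while (input.incrementToken()) {
        buffer.add(captureState());
      }
      consumed = true;
    }
    if (buffer.isEmpty()) {
      return false;
    }
    restoreState(buffer.removeFirst());
    return true;
  }

  public void reset() throws IOException {
    super.reset();    // forwards the reset down toward the Tokenizer
    buffer.clear();   // drop whatever was buffered for the previous input
    consumed = false;
  }
}

Without the reset() override, re-pointing the Tokenizer at a new Reader
leaves the stale buffer in place for the next document.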

The way the new APIs work, TokenStream reusability has become a must,
but it doesn't look like the implementations or interfaces of all our
tokenizers and filters are currently up to the job.
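
On the consuming side, the reuse contract would look roughly like this (a
sketch only; WhitespaceTokenizer and TermAttribute are the 2.9-era classes,
BufferingFilter is the made-up filter above):

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

public class ReuseDemo {
  public static void main(String[] args) throws IOException {
    // Build one chain and reuse it for two "documents".
    Tokenizer source = new WhitespaceTokenizer(new StringReader("first doc"));
    TokenStream chain = new BufferingFilter(source);

    drain(chain);

    // Reuse: point the Tokenizer at new input, then reset the whole chain
    // so any stateful filter gets a chance to drop leftovers.
    source.reset(new StringReader("second doc"));
    chain.reset();

    drain(chain);
  }

  private static void drain(TokenStream ts) throws IOException {
    TermAttribute term = (TermAttribute) ts.addAttribute(TermAttribute.class);
    while (ts.incrementToken()) {
      System.out.println(term.term());
    }
  }
}

If a filter has no working reset(), the chain.reset() call above does nothing
for it, which is exactly the gap with NGramTokenFilter today.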

-Yonik
http://www.lucidimagination.com


