lucene-dev mailing list archives

From: Yonik Seeley <>
Subject: Re: pieces missing in reusable analyzers?
Date: Mon, 10 Aug 2009 22:30:08 GMT
On Mon, Aug 10, 2009 at 6:21 PM, Earwin Burrfoot <> wrote:
> I'm just keeping a reference to Tokenizer, so I can reset it with a
> new reader. Though this situation is awkward, TS definitely does not
> need a reset(Reader).

Then how do you notify the other filters that they should reset their state?
Through TokenStream.reset()?  The javadoc says that's meant for something
else (rewinding a stream that will be consumed more than once, as
CachingTokenFilter does), but perhaps it can be reused for this purpose?
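To make the question concrete, here's a rough sketch of the reuse I have
in mind (against the new attribute-based API, assuming TokenFilter.reset()
keeps delegating to input.reset() as it does on trunk; the
WhitespaceTokenizer/LowerCaseFilter chain is just an arbitrary stand-in):

import java.io.StringReader;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

public class ReuseSketch {
  public static void main(String[] args) throws Exception {
    // Build the chain once, keeping a reference to the Tokenizer.
    Tokenizer source = new WhitespaceTokenizer(new StringReader("First Doc"));
    TokenStream stream = new LowerCaseFilter(source);
    drain(stream);

    // Reuse: point the Tokenizer at the next document, then reset() the
    // outermost stream.  TokenFilter.reset() forwards to input.reset(),
    // so a filter that overrides reset() gets a chance to clear its state.
    source.reset(new StringReader("Second Doc"));
    stream.reset();
    drain(stream);
  }

  private static void drain(TokenStream stream) throws Exception {
    TermAttribute term = (TermAttribute) stream.addAttribute(TermAttribute.class);
    while (stream.incrementToken()) {
      System.out.println(term.term());
    }
  }
}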

I specifically used NGramTokenFilter in my example because it does use
internal state (and it's a bug that it has no way to reset that state).
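The fix I'd expect is simply that any filter holding per-document state
overrides reset() and clears it.  Something along these lines (an
illustrative filter I made up for this mail, not the actual
NGramTokenFilter code):

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

// Illustrative only: a filter that keeps per-document state (the previous
// term) and therefore must clear it when the stream is reset for reuse.
public final class PairFilter extends TokenFilter {
  private final TermAttribute termAtt;
  private String previous;   // internal state that outlives a single call

  public PairFilter(TokenStream input) {
    super(input);
    termAtt = (TermAttribute) addAttribute(TermAttribute.class);
  }

  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }
    String current = termAtt.term();
    if (previous != null) {
      termAtt.setTermBuffer(previous + "_" + current);
    }
    previous = current;
    return true;
  }

  public void reset() throws IOException {
    super.reset();     // let the rest of the chain reset too
    previous = null;   // without this, state leaks into the next document
  }
}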

The way the new APIs work, TokenStream reusability has become a must,
but it doesn't look like the implementations or interfaces of all our
tokenizers and filters are currently up to the job.
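For reference, the reuse pattern an analyzer would have to implement looks
roughly like this (a sketch of a reusableTokenStream override; SavedStreams
is just a local holder class, and the chain is again arbitrary):

import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.WhitespaceTokenizer;

public class SketchAnalyzer extends Analyzer {

  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new LowerCaseFilter(new WhitespaceTokenizer(reader));
  }

  // Per-thread holder so the chain is built once and then reused.
  private static final class SavedStreams {
    Tokenizer source;      // kept so we can reset(Reader) it
    TokenStream result;    // outermost filter, handed back to consumers
  }

  public TokenStream reusableTokenStream(String fieldName, Reader reader)
      throws IOException {
    SavedStreams streams = (SavedStreams) getPreviousTokenStream();
    if (streams == null) {
      streams = new SavedStreams();
      streams.source = new WhitespaceTokenizer(reader);
      streams.result = new LowerCaseFilter(streams.source);
      setPreviousTokenStream(streams);
    } else {
      streams.source.reset(reader);  // point the Tokenizer at the new doc
      streams.result.reset();        // ask the filters to clear their state
    }
    return streams.result;
  }
}

The streams.result.reset() call in the else branch is exactly the
notification hook I'm asking about above.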

