lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From tsuraan <>
Subject Re: Customer TokenFilter
Date Thu, 27 May 2010 21:37:40 GMT
> Looks correct! Wrapping by CharBuffer is very intelligent! In Lucene 3.1 the
> new Term Attribute will implement CharSequence, then its even simplier. You
> may also look at 3.1's ICU contrib that has support even for Normalizer2.

Ok, I've only been looking at 3.0.1 so far; I'll check out the 3.1
work and see what's there.

> Overriding StandardAnalyzer is the wrong way, as in 3.1 its final (its only
> accidentially non-final)! You should always extend Analyzer and build the
> chain yourself. Or wrap another analyzer like in PerFieldAnalyzerWrapper.
> The whole Tokenization API in Lucene is based on the delegator pattern, so
> never extend any class not made for it!

Ok, I've changed it to just borrowing the guts of the StandardAnalyzer
rather than inheriting.  I also think I got the reuasableTokenStream
method correct this way, so that's a bonus.  Care to have another look
at it?

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message