lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Analyzer forcing tokenStream and reusableTokenStream to be final
Date Tue, 19 Oct 2010 15:21:09 GMT
On Tue, Oct 19, 2010 at 11:10 AM, Shai Erera <serera@gmail.com> wrote:
> Is there real danger in having my analyzer not declaring these methods final
> - something that can affect Lucene code for example? Or am I only risking my
> code?
>

There is a real danger: bugs like
https://issues.apache.org/jira/browse/LUCENE-1678

I would love for us to re-think the whole
tokenStream/reusableTokenStream issue...

If someone doesn't override both (e.g. they override only
tokenStream), then reusableTokenStream wouldn't actually use their
subclass's code. So the reflection hack from LUCENE-1678 forces the
analyzer to never reuse and to call tokenStream instead, but this is
very bad for indexing performance!
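
To make the failure mode concrete, here is a minimal sketch in plain
Java. It is not the real Lucene API: SketchAnalyzer,
LowerCaseSketchAnalyzer, ReversingSketchAnalyzer and the "guard" flag
are hypothetical names, and strings stand in for token streams. The
reflection check only approximates the kind of test LUCENE-1678 added.

```java
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical stand-in for an analyzer base class (not the Lucene API).
abstract class SketchAnalyzer {
    // Builds a brand-new "stream" on every call.
    public abstract String tokenStream(String text);

    // Default: no reuse, just delegate to tokenStream.
    public String reusableTokenStream(String text) {
        return tokenStream(text);
    }
}

// A concrete analyzer that implements both methods.
class LowerCaseSketchAnalyzer extends SketchAnalyzer {
    private final boolean guard; // hypothetical switch for the reflection check

    LowerCaseSketchAnalyzer(boolean guard) {
        this.guard = guard;
    }

    @Override
    public String tokenStream(String text) {
        return "fresh:" + text.toLowerCase(Locale.ROOT);
    }

    @Override
    public String reusableTokenStream(String text) {
        // With the guard on, a subclass that overrides tokenStream
        // forces the slow, non-reusing path.
        if (guard && overridesTokenStream()) {
            return tokenStream(text);
        }
        return "reused:" + text.toLowerCase(Locale.ROOT);
    }

    // True if the runtime class declares its own tokenStream.
    private boolean overridesTokenStream() {
        try {
            Method m = getClass().getMethod("tokenStream", String.class);
            return m.getDeclaringClass() != LowerCaseSketchAnalyzer.class;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}

// A user subclass that overrides only tokenStream.
class ReversingSketchAnalyzer extends LowerCaseSketchAnalyzer {
    ReversingSketchAnalyzer(boolean guard) {
        super(guard);
    }

    @Override
    public String tokenStream(String text) {
        return "fresh:" + new StringBuilder(text).reverse();
    }
}

class ReuseDemo {
    public static void main(String[] args) {
        // No guard: the subclass's tokenStream is silently ignored.
        System.out.println(
            new ReversingSketchAnalyzer(false).reusableTokenStream("Abc")); // reused:abc
        // Guard: the subclass's code runs, but nothing is reused.
        System.out.println(
            new ReversingSketchAnalyzer(true).reusableTokenStream("Abc")); // fresh:cbA
    }
}
```

Neither outcome is good: without the check the subclass is silently
wrong, and with it the subclass is correct but pays for a new stream
on every call. Declaring both methods final rules out this situation.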

Are there still real use cases where an analyzer cannot actually
reuse? For example, all Solr tokenstreams are reused. With an
application as big and widely used as that having no need for
non-reusable tokenStream(), I think we should seriously consider
simplifying the analysis api to be "reusable by default".
