lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Analyzer forcing tokenStream and reusableTokenStream to be final
Date Tue, 19 Oct 2010 16:11:11 GMT
My only problem is that, w/o disabling asserts, I cannot bypass these checks.
That's why I hoped we could limit the check itself to o.a.l code. As it
stands, we don't allow someone who knows what he's doing to subclass
analyzers and override these methods, just to protect those who don't know
what they're doing.

It's frustrating :).

I'm all for removing one of them and declaring the other one reusable and
being done with it, but I have a feeling this is a matter for a larger
discussion. What I'm asking for here is something much simpler: we don't
jeopardize Lucene code, and we document the risks of not overriding
reusableTokenStream.

Can't we change the assertion to not fail if the class declares
reusableTokenStream, yet nothing is final? Wouldn't that avoid the issues
you've mentioned?
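
A minimal sketch of what such a relaxed check might look like. This is not
Lucene code: SketchAnalyzer, RelaxedFinalCheck, BothDeclared and
OnlyTokenStream are hypothetical stand-ins, and the "class or both methods
final" part is only a guess at the shape of the existing assertion. The one
new rule is the last one: a non-final class still passes if it declares
reusableTokenStream itself.

```java
import java.io.Reader;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the Analyzer base class; NOT the real Lucene API.
abstract class SketchAnalyzer {
    public abstract Object tokenStream(String field, Reader reader);

    // Default just delegates to tokenStream(), i.e. nothing is reused.
    public Object reusableTokenStream(String field, Reader reader) {
        return tokenStream(field, reader);
    }
}

final class RelaxedFinalCheck {
    private RelaxedFinalCheck() {}

    // Returns true if the class would pass the (assumed) relaxed check.
    static boolean passes(Class<? extends SketchAnalyzer> clazz) {
        // Assumed existing rule: anonymous or final classes are fine.
        if (clazz.isAnonymousClass() || Modifier.isFinal(clazz.getModifiers())) {
            return true;
        }
        try {
            Method ts = clazz.getMethod("tokenStream", String.class, Reader.class);
            Method rts = clazz.getMethod("reusableTokenStream", String.class, Reader.class);
            // Assumed existing rule: both methods final is fine.
            if (Modifier.isFinal(ts.getModifiers()) && Modifier.isFinal(rts.getModifiers())) {
                return true;
            }
            // Proposed relaxation: nothing is final, but this very class
            // declares reusableTokenStream, so it did not fall into the trap.
            return rts.getDeclaringClass() == clazz;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}

// Declares both methods, nothing final: accepted under the relaxed rule.
class BothDeclared extends SketchAnalyzer {
    @Override public Object tokenStream(String field, Reader reader) { return "new"; }
    @Override public Object reusableTokenStream(String field, Reader reader) { return "reused"; }
}

// Subclass that overrides only tokenStream: the trap, still rejected.
class OnlyTokenStream extends BothDeclared {
    @Override public Object tokenStream(String field, Reader reader) { return "other"; }
}
```

With these stand-ins, RelaxedFinalCheck.passes(BothDeclared.class) is true
and RelaxedFinalCheck.passes(OnlyTokenStream.class) is false, so the
subclass-without-reusableTokenStream case would still trip the assertion.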

Shai

On Tue, Oct 19, 2010 at 5:59 PM, Robert Muir <rcmuir@gmail.com> wrote:

> On Tue, Oct 19, 2010 at 11:52 AM, Shai Erera <serera@gmail.com> wrote:
> > I still don't understand how not declaring my tokenStream and
> > reusableTokenStream final can break anything. The methods are there (in
> > my analyzers), and if I risk overriding them somewhere else it's my
> > problem.
> >
>
> Well it is your problem, but we created it with our confusing APIs :)
>
> So if you subclass your analyzer but only implement tokenStream and
> not also reusableTokenStream, you get terrible performance, as in
> https://issues.apache.org/jira/browse/LUCENE-2279
>
> By enforcing these to be final we prevent the trap where you subclass
> and don't implement reusableTokenStream and get bad performance, but
> it's still not completely solved.
> There is still the trap (with even more overhead under the
> attributes-based API) that you just implement an Analyzer with only
> tokenStream and get bad performance.
>
> If we only had one of these methods, let's say called "tokenStream",
> and it was reusable, we could remove these final checks completely.
