lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Analyzer forcing tokenStream and reusableTokenStream to be final
Date Tue, 19 Oct 2010 16:26:45 GMT
Trunk has a new API, so why not clean all that stuff up? We no longer have any really old analyzers,
as they all needed to be upgraded before 3.0. If you did not add reset(), it's a bug, as it's clearly
documented to be a must.
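
To illustrate why reset() matters for reuse, here is a minimal sketch. The class names (WhitespaceStream, ReusingAnalyzer) and their method signatures are hypothetical stand-ins invented for this example, not Lucene's actual Analyzer or TokenStream API; the sketch only shows the general pattern of one cached stream per thread that is reset before each use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream class.
class WhitespaceStream {
    private String text = "";
    private int pos = 0;

    // Point the stream at new input and clear all per-document state.
    // If pos were not cleared here, a reused stream would start reading
    // the new text from a stale offset.
    void reset(String newText) {
        this.text = newText;
        this.pos = 0;
    }

    // Returns the next whitespace-delimited token, or null when exhausted.
    String next() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        if (pos >= text.length()) {
            return null;
        }
        int start = pos;
        while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return text.substring(start, pos);
    }
}

// Hypothetical analyzer that is "reusable by default": it keeps one stream
// per thread and hands the same instance back after resetting it.
class ReusingAnalyzer {
    private final ThreadLocal<WhitespaceStream> cached =
        ThreadLocal.withInitial(WhitespaceStream::new);

    WhitespaceStream tokenStream(String text) {
        WhitespaceStream stream = cached.get();
        stream.reset(text); // reuse is only safe because reset() is correct
        return stream;
    }

    // Helper for the example: drain a stream into a list of tokens.
    static List<String> tokens(WhitespaceStream stream) {
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }
}
```

In this sketch, two consecutive calls to tokenStream() on the same thread return the same object, and the second call yields correct tokens only because reset() clears the position left over from the first document.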

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Tuesday, October 19, 2010 6:21 PM
> To: dev@lucene.apache.org
> Subject: Re: Analyzer forcing tokenStream and reusableTokenStream to be final
> 
> On Tue, Oct 19, 2010 at 12:17 PM, DM Smith <dmsmith555@gmail.com>
> wrote:
> 
> > I'd be surprised if there are use cases for non-reuse.
> >
> > IIRC: When we started down the reuse path, the goal was reuse only, not just
> > reuse by default. But in order to bridge the past to the future, there was the
> > possibility of continued non-reuse. In a sense non-reuse was deprecated, but
> > I'm not sure that @deprecated as a mechanism was able to clearly indicate
> > that.
> >
> 
> Exactly: I don't think there's a clear way to detect that your
> tokenStream() method is "reuse-safe" and deprecate it: e.g. you have to
> implement reset() correctly in your tokenstreams.
> 
> But let's think about this: for non-experts, making Analyzer "reusable by default"
> by removing reusableTokenStream() and reusing
> tokenStream() would probably be the single largest indexing performance
> improvement we could make... the API is so confusing that I think many people
> probably have analyzers that aren't reusing today.
> 
> I think it's worth considering a backwards break, especially since, as Mike
> mentioned, for the very special (possibly even only theoretical!) non-reuse
> case, there are ways they could still index: but the "fast way" should be the
> "easy/default way".
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


