lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Analyzer forcing tokenStream and reusableTokenStream to be final
Date Tue, 19 Oct 2010 17:03:06 GMT
About the whole assertion (it also affects TokenStreams): We want to make
sure that all Lucene/Solr TokenStreams and Analyzers are final or have final
implementations (even once we remove the reusable method).

 

The idea is to hit this assert only for classes under the
org.apache.lucene package prefix! That way we still test Lucene's own code,
but simply ignore all other subclasses. The method assertFinal can do this
for us:

 

Index: Analyzer.java
===================================================================
--- Analyzer.java   (revision 1023877)
+++ Analyzer.java   (working copy)
@@ -48,6 +48,8 @@
   private boolean assertFinal() {
     try {
       final Class<?> clazz = getClass();
+      if (!clazz.getName().startsWith("org.apache.lucene."))
+        return true;
       assert clazz.isAnonymousClass() ||
         (clazz.getModifiers() & (Modifier.FINAL | Modifier.PRIVATE)) != 0 ||
         (

 

Same for TokenStream. This is not a performance problem, as assertFinal is
only called when asserts are enabled (the trick is "assert assertFinal();" in
the ctor).
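
To illustrate the trick, here is a minimal, self-contained sketch. It is not the actual Lucene code: the names AssertFinalSketch, Base, FinalImpl, OpenImpl and passesFinalCheck are made up for this example, and it only performs the class-level part of the check (the per-method part of the real assert is not shown in the patch excerpt above, so it is left out here).

```java
import java.lang.reflect.Modifier;

public class AssertFinalSketch {

    // Prefix of the classes we want to police; the patch uses this value.
    static final String OWN_PREFIX = "org.apache.lucene.";

    /** Returns true if clazz is outside ownPrefix, anonymous, final or private. */
    static boolean passesFinalCheck(Class<?> clazz, String ownPrefix) {
        // Classes outside our own package prefix are simply ignored.
        if (!clazz.getName().startsWith(ownPrefix)) {
            return true;
        }
        return clazz.isAnonymousClass()
            || (clazz.getModifiers() & (Modifier.FINAL | Modifier.PRIVATE)) != 0;
    }

    /** Hypothetical base class standing in for Analyzer or TokenStream. */
    abstract static class Base {
        protected Base() {
            // An assert statement's expression is only evaluated when
            // assertions are enabled (java -ea), so assertFinal() costs
            // nothing in a normal production run.
            assert assertFinal();
        }

        private boolean assertFinal() {
            assert passesFinalCheck(getClass(), OWN_PREFIX)
                : getClass().getName() + " must be final";
            return true;
        }
    }

    static final class FinalImpl extends Base {}

    static class OpenImpl extends Base {}

    public static void main(String[] args) {
        // Pretend this sketch's own classes are the "policed" ones.
        String here = AssertFinalSketch.class.getName();
        System.out.println(passesFinalCheck(FinalImpl.class, here));      // final class
        System.out.println(passesFinalCheck(OpenImpl.class, here));       // non-final class
        System.out.println(passesFinalCheck(OpenImpl.class, OWN_PREFIX)); // outside the prefix
        new OpenImpl(); // never trips the assert: not under org.apache.lucene.
    }
}
```

Because the check sits behind "assert", a user class like OpenImpl can be instantiated with assertions enabled for the whole JVM and still never hit it.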

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Uwe Schindler [mailto:uwe@thetaphi.de] 
Sent: Tuesday, October 19, 2010 6:18 PM
To: dev@lucene.apache.org
Subject: RE: Analyzer forcing tokenStream and reusableTokenStream to be
final

 

By the way, the same tests are done for TokenStream subclasses (whose impls
must be final in all cases - it's defined as a decorator pattern, so we
enforce it). And: you don't need to make the class itself final; it's enough
to make both methods final.
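
As a rough illustration of "final methods on a non-final class", here is a small sketch. Everything in it is hypothetical: the Base and MyAnalyzer classes use simplified String-based signatures rather than Lucene's real tokenStream/reusableTokenStream signatures, and hasFinalMethod is a helper written for this example, not the check Lucene performs.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class FinalMethodsSketch {

    /** Hypothetical analyzer-like base class with simplified signatures. */
    abstract static class Base {
        abstract String tokenStream(String text);
        abstract String reusableTokenStream(String text);
    }

    /** The class itself stays open, but both methods are final. */
    static class MyAnalyzer extends Base {
        @Override
        final String tokenStream(String text) {
            return text.toLowerCase();
        }

        @Override
        final String reusableTokenStream(String text) {
            return tokenStream(text);
        }
    }

    /** Subclassing still works; only the two methods are locked down. */
    static class MySubAnalyzer extends MyAnalyzer {}

    /** Finds the nearest declaration of the method and reports whether it is final. */
    static boolean hasFinalMethod(Class<?> clazz, String name, Class<?>... params) {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, params);
                return Modifier.isFinal(m.getModifiers());
            } catch (NoSuchMethodException e) {
                // Not declared on this class; keep walking up the hierarchy.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(Modifier.isFinal(MyAnalyzer.class.getModifiers()));
        System.out.println(hasFinalMethod(MySubAnalyzer.class, "tokenStream", String.class));
        System.out.println(hasFinalMethod(MySubAnalyzer.class, "reusableTokenStream", String.class));
    }
}
```

A subclass such as MySubAnalyzer can still add fields or helper methods; the compiler just refuses any attempt to override the two final methods.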

 


From: Shai Erera [mailto:serera@gmail.com] 
Sent: Tuesday, October 19, 2010 6:06 PM
To: dev@lucene.apache.org
Subject: Re: Analyzer forcing tokenStream and reusableTokenStream to be
final

 

I guess you didn't read my email all the way through - I cannot disable
assertions for Lucene stuff because I use Lucene's assertions to assert that
my indexing code works :).

Shai

On Tue, Oct 19, 2010 at 5:59 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

We simply added that to *test* the bundled analyzers for conformance. If you
don't like that, you can simply disable assertions for the org.apache.lucene
package.

 


From: Shai Erera [mailto:serera@gmail.com] 
Sent: Tuesday, October 19, 2010 5:53 PM
To: dev@lucene.apache.org
Subject: Re: Analyzer forcing tokenStream and reusableTokenStream to be
final

 

I still don't understand how not declaring my tokenStream and
reusableTokenStream final can break anything. The methods are there (in my
analyzers), and if I risk overriding them somewhere else it's my problem.

What am I missing?

To add to your email - I too haven't yet encountered an analyzer that cannot
be reused.

Shai

On Tue, Oct 19, 2010 at 5:45 PM, Robert Muir <rcmuir@gmail.com> wrote:

On Tue, Oct 19, 2010 at 11:21 AM, Robert Muir <rcmuir@gmail.com> wrote:
> If someone doesn't override both (e.g. they just override
> tokenStream), then it wouldn't actually use their subclass's code. So
> then the reflection hack from LUCENE-1678 would force the analyzer to
> never re-use, but instead call tokenStream: but this is very bad for
> indexing performance!
>

Here's a JIRA issue with an example of how the
tokenStream/reusableTokenStream confusion makes this a real problem in
practice:

https://issues.apache.org/jira/browse/LUCENE-2279



 

 

