lucene-java-user mailing list archives

From Terry Smith <sheb...@gmail.com>
Subject CustomAnalyzer and AttributeFactories
Date Thu, 14 Jul 2016 17:26:28 GMT
I've hit a runtime issue when consuming the nightly 7.0.0-SNAPSHOT Maven
build and was wondering if someone could shed some light on it.

Some custom code is causing the following exception:

java.lang.IllegalArgumentException: State contains AttributeImpl of type
org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl that is
not in in this AttributeSource

This seems to be related to the change made by
https://issues.apache.org/jira/browse/LUCENE-7355, which changes
CustomAnalyzer like so:

   protected TokenStreamComponents createComponents(String fieldName) {
-    final Tokenizer tk = tokenizer.create();
+    final Tokenizer tk = tokenizer.create(attributeFactory());

I'm trying to untangle the attribute factory logic and have so far figured
out that CustomAnalyzer is now using AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY
whereas it used to use TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.

The old default would use a PackedTokenAttributeImpl, whereas the new
default seems to use a separate AttributeImpl class for each of the common
attributes.
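
The suspected mismatch can be illustrated with a toy model. This is only a
sketch in plain Java, not Lucene's code: Source, PackedImpl and TermImpl are
hypothetical stand-ins, and it assumes that restoring captured state requires
the target to hold the same implementation classes as the source it was
captured from, which is what the exception message above suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Lucene's; all names are hypothetical.
public class AttributeMismatchSketch {

    // One object backing all common token attributes (stands in for a "packed" impl).
    static final class PackedImpl { String term = ""; }

    // A dedicated object for the term attribute only (stands in for a per-attribute impl).
    static final class TermImpl { String term = ""; }

    // Hypothetical stand-in for an attribute source: implementations keyed by their class.
    static final class Source {
        private final Map<Class<?>, Object> impls = new LinkedHashMap<>();

        Source(Object... implementations) {
            for (Object impl : implementations) {
                impls.put(impl.getClass(), impl);
            }
        }

        // In this model, captured state is just the list of implementation objects.
        List<Object> captureState() {
            return new ArrayList<>(impls.values());
        }

        // Restoring requires every captured implementation class to be present here.
        void restoreState(List<Object> state) {
            for (Object captured : state) {
                if (!impls.containsKey(captured.getClass())) {
                    throw new IllegalArgumentException(
                        "State contains implementation of type "
                        + captured.getClass().getSimpleName()
                        + " that is not in this source");
                }
            }
        }
    }

    // Returns true when state captured from `from` can be restored into `to`.
    static boolean canTransfer(Source from, Source to) {
        try {
            to.restoreState(from.captureState());
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Source perAttribute = new Source(new TermImpl());
        Source packed = new Source(new PackedImpl());
        System.out.println(canTransfer(perAttribute, perAttribute)); // true
        System.out.println(canTransfer(perAttribute, packed));       // false
    }
}
```

In this model the transfer only succeeds when both sides were built with the
same kind of implementation, which mirrors code that captures state from a
stream created with one factory and restores it into a stream created with
another.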

Unfortunately, CustomAnalyzer is final (with a default protected
constructor), making it impossible to extend it and override the
attributeFactory method.

I think I should fix this by making CustomAnalyzer use the previous default
attribute factory that uses PackedTokenAttributeImpl but am unsure how to
achieve that.

Am I understanding my problem and ideal solution correctly here? If so,
should I submit a patch to open up CustomAnalyzer to achieve this goal?

--Terry
