lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Custom tokenizer
Date Mon, 12 Jan 2015 08:06:23 GMT
Hi,

Extending an existing Analyzer is not useful, because an Analyzer is just a factory that returns a
TokenStream instance to consumers. If you want to change the Tokenizer of an existing Analyzer,
copy its source into your own Analyzer subclass and rewrite its createComponents() method; see
the example in the Javadocs:
http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/analysis/Analyzer.html
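
As a rough sketch of that approach against the 4.10.x API (lucene-analyzers-common on the
classpath): the class name WhitespaceEnglishAnalyzer is made up for illustration,
WhitespaceTokenizer only stands in for your own Tokenizer, and the filter chain is a simplified
approximation of what EnglishAnalyzer does, not a copy of it.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.en.PorterStemFilter;

// Hypothetical class, not part of Lucene: an English-style analyzer
// built around a Tokenizer of your own choosing.
final class WhitespaceEnglishAnalyzer extends Analyzer {

  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Assumption: WhitespaceTokenizer is a placeholder for the custom Tokenizer.
    Tokenizer source = new WhitespaceTokenizer(reader);

    // Simplified English chain: lowercase, remove stop words, stem.
    TokenStream result = new LowerCaseFilter(source);
    result = new StopFilter(result, EnglishAnalyzer.getDefaultStopSet());
    result = new PorterStemFilter(result);

    // The Tokenizer and the end of the filter chain are handed back together.
    return new TokenStreamComponents(source, result);
  }
}
```

The Tokenizer is created first and every filter is stacked on top of it, which is why it has to
be chosen inside createComponents() and cannot be swapped from outside.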

If you want to add additional TokenFilters to the chain, you can do this with AnalyzerWrapper
(http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/analysis/AnalyzerWrapper.html),
but this does not work with Tokenizers, because those are instantiated before the TokenFilters
which depend on them, so changing the Tokenizer afterwards is impossible.
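
A minimal sketch of the AnalyzerWrapper route, again against 4.10.x: the class name
AsciiFoldingWrapper is hypothetical, and ASCIIFoldingFilter is just an arbitrary example of a
filter to append.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter;

// Hypothetical class, not part of Lucene: appends one TokenFilter to the
// end of whatever chain the wrapped Analyzer produces.
final class AsciiFoldingWrapper extends AnalyzerWrapper {

  private final Analyzer delegate;

  AsciiFoldingWrapper(Analyzer delegate) {
    // Reuse the wrapped analyzer's strategy for caching components.
    super(delegate.getReuseStrategy());
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected TokenStreamComponents wrapComponents(String fieldName,
                                                 TokenStreamComponents components) {
    // The Tokenizer already exists here and is passed through unchanged;
    // only the end of the chain can be extended.
    TokenStream filtered = new ASCIIFoldingFilter(components.getTokenStream());
    return new TokenStreamComponents(components.getTokenizer(), filtered);
  }
}
```

Something like new AsciiFoldingWrapper(new EnglishAnalyzer()) would then fold accented
characters after the English chain has run, while the Tokenizer stays whatever EnglishAnalyzer
created.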

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Vihari Piratla [mailto:viharipiratla@gmail.com]
> Sent: Monday, January 12, 2015 8:51 AM
> To: java-user@lucene.apache.org
> Subject: Custom tokenizer
> 
> Hi,
> I am trying to implement a custom tokenizer for my application and I have
> a few questions about it.
> 1. Is there a way to give an existing analyzer (say EnglishAnalyzer) a
> custom tokenizer and make it use that tokenizer instead of, say,
> StandardTokenizer?
> 2. Why are analyzers such as StandardAnalyzer and EnglishAnalyzer declared final?
> Because of that, I cannot extend them.
> 
> Thank you.
> --
> V



