lucene-java-user mailing list archives

From Vihari Piratla <viharipira...@gmail.com>
Subject Re: Custom tokenizer
Date Mon, 12 Jan 2015 08:15:43 GMT
Thanks for the reply.

Hmm, I understand.
I know about AnalyzerWrapper, but that is not what I am looking for.

I also know about cloning the analyzer and rewriting createComponents(). I
want my analyzer to behave exactly like EnglishAnalyzer apart from the
tokenizer, and right now I am copying the code from EnglishAnalyzer to mimic
its behavior, which feels like a dirty solution.
Is there a cleaner solution to this problem?

Thank you.
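
To make the reasoning in the quoted reply below concrete, here is a minimal plain-Java sketch. It is a toy model, not Lucene code: the types Tokenizer, TokenFilter and Analyzer, and every constant in it, are hypothetical stand-ins invented for illustration. It only tries to show why a wrapper can append a filter to a finished chain but cannot replace the tokenizer, so that rebuilding the chain (the "clone and rewrite createComponents()" approach) is what remains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Toy model only: none of these types are Lucene classes.
final class AnalyzerChainSketch {

    // Hypothetical stand-in for a Tokenizer: raw text in, tokens out.
    interface Tokenizer { List<String> tokenize(String text); }

    // Hypothetical stand-in for a TokenFilter: transforms upstream tokens.
    interface TokenFilter { List<String> filter(List<String> tokens); }

    // Stand-in for an Analyzer: a factory that hands out a finished chain.
    interface Analyzer { Function<String, List<String>> createComponents(); }

    static final Tokenizer WHITESPACE = text -> Arrays.asList(text.trim().split("\\s+"));
    static final Tokenizer COMMA = text -> Arrays.asList(text.trim().split("\\s*,\\s*"));

    static final TokenFilter LOWER_CASE = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) out.add(t.toLowerCase(Locale.ROOT));
        return out;
    };

    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    static final TokenFilter STOP = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) if (!STOP_WORDS.contains(t)) out.add(t);
        return out;
    };

    static final TokenFilter DROP_SHORT = tokens -> {
        List<String> out = new ArrayList<>();
        for (String t : tokens) if (t.length() >= 4) out.add(t);
        return out;
    };

    // Models a final analyzer: the tokenizer is created inside
    // createComponents() and the filters are stacked on top of it right away.
    static final Analyzer ENGLISH_LIKE =
        () -> text -> STOP.filter(LOWER_CASE.filter(WHITESPACE.tokenize(text)));

    // Models the wrapper approach: the wrapper only sees the finished chain,
    // so it can append a filter but has no handle on the tokenizer inside.
    static Analyzer wrapWith(Analyzer delegate, TokenFilter extra) {
        return () -> {
            Function<String, List<String>> chain = delegate.createComponents();
            return text -> extra.filter(chain.apply(text));
        };
    }

    // Models "clone and rewrite createComponents()": the same filters are
    // rebuilt by hand on top of a different tokenizer.
    static final Analyzer ENGLISH_LIKE_WITH_COMMA =
        () -> text -> STOP.filter(LOWER_CASE.filter(COMMA.tokenize(text)));

    static List<String> analyze(Analyzer analyzer, String text) {
        return analyzer.createComponents().apply(text);
    }
}
```

In this model, analyze(ENGLISH_LIKE, ...) and analyze(ENGLISH_LIKE_WITH_COMMA, ...) share their filters only because the second chain repeats them by hand, which mirrors the code copying described above.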

On Mon, Jan 12, 2015 at 1:36 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> Extending an existing Analyzer is not useful, because it is just a factory
> that returns a TokenStream instance to consumers. If you want to change the
> Tokenizer of an existing Analyzer, just clone it and rewrite its
> createComponents() method, see the example in the Javadocs:
> http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/analysis/Analyzer.html
>
> If you want to add additional TokenFilters to the chain, you can do this
> with AnalyzerWrapper (
> http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/analysis/AnalyzerWrapper.html),
> but this does not work with Tokenizers, because those are instantiated
> before the TokenFilters which depend on them, so changing the Tokenizer
> afterwards is impossible.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Vihari Piratla [mailto:viharipiratla@gmail.com]
> > Sent: Monday, January 12, 2015 8:51 AM
> > To: java-user@lucene.apache.org
> > Subject: Custom tokenizer
> >
> > Hi,
> > I am trying to implement a custom tokenizer for my application and I have
> > a few questions about it.
> > 1. Is there a way to give an existing analyzer (say EnglishAnalyzer) a
> > custom tokenizer and make it use that tokenizer instead of, say,
> > StandardTokenizer?
> > 2. Why are analyzers such as StandardAnalyzer and EnglishAnalyzer declared
> > final? Because of that, I cannot extend them.
> >
> > Thank you.
> > --
> > V


-- 
V
