lucene-java-user mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Analyzer classes versus the constituent components
Date Tue, 08 Oct 2013 14:30:35 GMT
Is there any guidance on when it's appropriate to create an
Analyzer class, as opposed to just Tokenizer and TokenFilter classes?

The advantage of the constituent elements is that they allow the
consuming application to add more filters. The only disadvantage I see
is that the following is a bit on the verbose side. Is there some
advantage or use of an Analyzer class that I'm missing?

private Analyzer newAnalyzer() {
    return new Analyzer() {
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer source = tokenizerFactory.create(reader, LanguageCode.JAPANESE);
            com.basistech.rosette.bl.Analyzer rblAnalyzer;
            try {
                rblAnalyzer = analyzerFactory.create(LanguageCode.JAPANESE);
            } catch (IOException e) {
                throw new RuntimeException("Error creating RBL analyzer", e);
            }
            BaseLinguisticsTokenFilter filter = new BaseLinguisticsTokenFilter(source, rblAnalyzer);
            return new TokenStreamComponents(source, filter);
        }
    };
}
