lucene-java-user mailing list archives

From Christian Aschoff <christian.asch...@uni-ulm.de>
Subject Where to place a filter...
Date Thu, 22 Nov 2007 19:27:20 GMT
Hello,

As a preface: I have no actual problem, and everything works the way I want :-)

I am mainly looking for advice on whether I am using the right pattern.
I want to strip accents before data goes into my index, so I wrote
the code below. A Google search did not turn up an example of where to
place a filter (for indexing), so this is my guess at how to do it.

My question is: is this the correct pattern for using a filter,
or should the filter be placed somewhere else?

Thank you in advance for any comments,
Christian

---------------------------------------------------------------
/*
 * RetroBibAnalyzer.java
 *
 * Created on 22 November 2007, 12:42
 *
 */

package de.retrobib.lucene;

import java.io.Reader;
import org.apache.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.de.GermanAnalyzer;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

/**
 * Analyzer for the Lucene index. Currently only a
 * wrapper, to make later extensions easier.
 *
 * @author caschoff
 * @version 1.0
 */
public class RetroBibAnalyzer extends Analyzer {

    /**
     * <b>Every</b> class has its own logger.
     */
    private static final Logger logger =
            Logger.getLogger(RetroBibAnalyzer.class);

    /** The wrapped analyzer. */
    private static final SnowballAnalyzer analyzer =
            new SnowballAnalyzer("German", GermanAnalyzer.GERMAN_STOP_WORDS);

    /** Creates a new instance of RetroBibAnalyzer */
    public RetroBibAnalyzer() {
        super();
    }

    /**
     * Processes the token stream.
     *
     * @param fieldName The name of the field.
     * @param reader The reader.
     * @return The token stream.
     */
    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Strip accents from the tokens the wrapped analyzer produces.
        return new UTF8AccentFilter(analyzer.tokenStream(fieldName, reader));
    }

}
---------------------------------------------------------------
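
For readers without the UTF8AccentFilter source, here is a minimal
sketch of the same wrapping idea in plain Java, with no Lucene
dependency. All names in it (AccentFilterSketch, AccentFoldingIterator,
fold, baseTokens, analyze) are hypothetical and are not part of Lucene
or of the code above. It assumes that "stripping accents" means
decomposing each term to Unicode NFD and dropping the combining marks,
which may differ from what UTF8AccentFilter actually does.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

public class AccentFilterSketch {

    /** Hypothetical stand-in for a token filter: it wraps another token source. */
    public static final class AccentFoldingIterator implements Iterator<String> {
        private final Iterator<String> input;

        public AccentFoldingIterator(Iterator<String> input) {
            this.input = input;
        }

        @Override
        public boolean hasNext() {
            return input.hasNext();
        }

        @Override
        public String next() {
            // The wrapped source has already done its work on this token.
            return fold(input.next());
        }
    }

    /** Decomposes the term to NFD and removes the combining marks. */
    public static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Stand-in for the wrapped analyzer: lower-cases and splits on whitespace. */
    public static Iterator<String> baseTokens(String text) {
        String[] parts = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return Arrays.asList(parts).iterator();
    }

    /** Mirrors the shape of tokenStream(): the filter wraps the inner stream. */
    public static List<String> analyze(String text) {
        Iterator<String> stream = new AccentFoldingIterator(baseTokens(text));
        List<String> terms = new ArrayList<>();
        while (stream.hasNext()) {
            terms.add(stream.next());
        }
        return terms;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(analyze("\u00DCber Caf\u00E9 Z\u00FCrich"));
    }
}
```

With this arrangement the folding step only sees tokens after the
wrapped source has finished with them. In the analyzer above, the
accent filter likewise runs after the SnowballAnalyzer's own
processing.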

---
Dipl. Ing. (FH) Christian Aschoff

Office:
Universität Ulm
Kommunikations- und Informationszentrum
Abt. Informationssysteme
Raum O26/5403
Albert-Einstein-Allee 11
89081 Ulm

Tel. 0731 50-22432
Fax. 0731 50-22471
christian.aschoff@uni-ulm.de

Private:
Fabristr. 13
89075 Ulm
Deutschland/Old Europe

Tel. 0731 602 803 60
Fax. 0731 602 803 61
Mob. 0171 272 03 04
caschoff@mac.com

Please help out: www.retrobibliothek.de



