lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Taylor <paul_t...@fastmail.fm>
Subject StandardFilter not handling dots as exptected ?
Date Thu, 06 Aug 2009 14:03:30 GMT

Hi want the query "R.E.S" to match "R.E.S"

I use StandardFilter in my analyzer below and the description says:

    'Splits words at punctuation characters, removing punctuation. 
However, a dot that's not followed by whitespace is considered part of a 
token. '

so I thought that R.E.S. would become searchable as R.E.S, and the 
search would work, but it doesn't whereas searching for "R.E.S" does 
return a hit .

thanks Paul

public class StandardUnaccentAnalyzer extends Analyzer {

    public TokenStream tokenStream(String fieldName, Reader reader) {
        StandardTokenizer tokenStream = new StandardTokenizer(reader);
        TokenStream result = new StandardFilter(tokenStream);
        result = new LowerCaseFilter(result);
        return result;
    }
   
    private static final class SavedStreams {
        StandardTokenizer tokenStream;
        TokenStream filteredTokenStream;
    }
   
    public TokenStream reusableTokenStream(String fieldName, Reader 
reader) throws IOException {
        SavedStreams streams = (SavedStreams)getPreviousTokenStream();
        if (streams == null) {
            streams = new SavedStreams();
            setPreviousTokenStream(streams);
            streams.tokenStream = new StandardTokenizer(reader);
            streams.filteredTokenStream = new 
StandardFilter(streams.tokenStream);
            streams.filteredTokenStream = new 
LowerCaseFilter(streams.filteredTokenStream);
        }
        else {
            streams.tokenStream.reset(reader);
        }
        return streams.filteredTokenStream;
    }

}

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message