lucene-java-user mailing list archives

Subject Re: AW: AW: Wildcard search fails
Date Fri, 14 Aug 2009 10:56:23 GMT
> Noticed that in Luke... is there any existing analyzer around that 
> supports case-insensitive search and recognizes "RZ/G/17" as one token? 

As far as I know, there is no built-in analyzer that combines a whitespace tokenizer with a
lowercase filter. But it is easy to chain a tokenizer and token filters to create a new analyzer.
You said that you were using SnowballAnalyzer, right? Just replace its standard tokenizer with
a whitespace tokenizer by editing tokenStream:

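public TokenStream tokenStream(String fieldName, Reader reader) {
    // Split on whitespace only, so "RZ/G/17" stays a single token.
    TokenStream result = new WhitespaceTokenizer(reader);
    // Lowercase every token for case-insensitive matching.
    result = new LowerCaseFilter(result);
    if (stopSet != null) {
        result = new StopFilter(result, stopSet);
    }
    // Remove this line if you do not want stemming.
    result = new SnowballFilter(result, name);
    return result;
}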
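
To see why this chain keeps "RZ/G/17" together, here is a minimal plain-Java sketch that only
imitates the effect of a whitespace tokenizer followed by a lowercase filter. It does not use
Lucene at all; the class name WhitespaceLowercaseSketch and the helper analyze are hypothetical
names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WhitespaceLowercaseSketch {

    // Hypothetical helper (not a Lucene API): split on whitespace only,
    // then lowercase each token. Punctuation such as '/' is left untouched.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [part, rz/g/17, shipped]
        System.out.println(analyze("Part RZ/G/17 shipped"));
    }
}
```

A tokenizer that also splits on punctuation would break the same input at the slashes, which
is why swapping in the whitespace tokenizer matters here.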

By changing this method you can create custom analyzers. If you do not want stemming, just
delete the line that contains SnowballFilter, and so on. Wildcard terms are not passed through
the analyzer at query time, so with a LowerCaseFilter in your analyzer the indexed terms are
lowercase and your wildcard queries must be lowercased too. Alternatively, you can call
setLowercaseExpandedTerms(true) on your QueryParser.
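
The case mismatch can be shown with a small plain-Java sketch. It only imitates a trailing-star
wildcard with a prefix check and is not how Lucene evaluates wildcard queries; the class name
WildcardCaseSketch and the helper prefixMatches are hypothetical.

```java
import java.util.Locale;

public class WildcardCaseSketch {

    // Hypothetical stand-in for a prefix-style wildcard: "rz/g*" matches
    // any term that starts with "rz/g". The comparison is case-sensitive,
    // just like a term lookup against the index.
    static boolean prefixMatches(String indexedTerm, String pattern) {
        String prefix = pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        String indexedTerm = "rz/g/17";   // what a lowercasing analyzer stores
        String userPattern = "RZ/G*";     // what the user typed

        // prints false: the raw pattern does not match the lowercased term
        System.out.println(prefixMatches(indexedTerm, userPattern));

        // prints true: lowercasing the pattern first makes it match
        System.out.println(prefixMatches(indexedTerm, userPattern.toLowerCase(Locale.ROOT)));
    }
}
```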

