lucene-java-user mailing list archives

From Paul Taylor <paul_t...@fastmail.fm>
Subject Using MappingCharFilter in analyzer breaking wildcard matches
Date Mon, 25 Mar 2013 12:50:20 GMT
I created this simple StripSpacesAndSeparatorsAnalyzer so that it 
ignores certain characters, such as hyphens, in the field. That way I 
can search for either of

catno:WRATHCD25
catno:WRATHCD-25

and get the same results, and that works (the original value of the 
field added to the index was WRATHCD-25)

However, there is a problem with wildcard searching:

catno:WRATHCD25*

works, but

catno:WRATHCD-25*

does not

If I amend the analyzer to comment out the initReader() method then

catno:WRATHCD-25*

now works but of course

catno:WRATHCD25

no longer works.


What am I doing wrong, please?


public class StripSpacesAndSeparatorsAnalyzer extends Analyzer {

     protected NormalizeCharMap charConvertMap;

     protected void setCharConvertMap() {

         NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
         builder.add(" ","");
         builder.add("-","");
         builder.add("_","");
         builder.add(":","");
         charConvertMap = builder.build();
     }

     public StripSpacesAndSeparatorsAnalyzer() {
         setCharConvertMap();
     }

     @Override
     protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
         Tokenizer source = new KeywordTokenizer(reader);
         TokenStream filter = new LowercaseFilter(source);
         return new TokenStreamComponents(source, filter);
     }


     @Override
     protected Reader initReader(String fieldName,
                                 Reader reader)
     {
         return new MappingCharFilter(charConvertMap, reader);
     }
}
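
To make the symptoms concrete, here is a plain-Java sketch with no Lucene dependency. It rests on an assumption that is not confirmed anywhere in this message: that the query parser only lowercases wildcard terms and does not pass them through initReader(), so the hyphen survives in the prefix. The class and method names (WildcardNormalizationSketch, normalize, prefixMatches) are hypothetical and exist only for illustration.

```java
import java.util.Locale;

public class WildcardNormalizationSketch {

    // Stand-in for the MappingCharFilter + lowercasing chain in the analyzer:
    // drops space, hyphen, underscore and colon, then lowercases.
    static String normalize(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c != ' ' && c != '-' && c != '_' && c != ':') {
                sb.append(c);
            }
        }
        return sb.toString().toLowerCase(Locale.ROOT);
    }

    // A trailing-* query behaves like a prefix test against the indexed term.
    static boolean prefixMatches(String indexedTerm, String prefix) {
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        // Term as it would sit in the index with initReader() enabled.
        String indexed = normalize("WRATHCD-25");

        // ASSUMPTION: wildcard terms are lowercased only, not char-filtered.
        String rawPrefix = "WRATHCD-25".toLowerCase(Locale.ROOT);

        System.out.println(prefixMatches(indexed, "wrathcd25"));             // true
        System.out.println(prefixMatches(indexed, rawPrefix));               // false
        // Applying the same mapping to the prefix by hand restores the match.
        System.out.println(prefixMatches(indexed, normalize("WRATHCD-25"))); // true
    }
}
```

If that assumption holds, it would also explain why commenting out initReader() flips the behaviour: the index would then hold wrathcd-25, which the raw prefix matches but the hyphen-free term WRATHCD25 does not.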
