lucene-solr-user mailing list archives

From j <>
Subject word delimiter
Date Thu, 05 Aug 2010 13:50:16 GMT
I have the term UPPER12-lower and would like to be able to find it with the
queries "UPPER" or "lower". What should break this up at index time: a
tokenizer, or a filter such as WordDelimiterFilterFactory?

I have tried various combinations of parameters to
WordDelimiterFilterFactory and can't get it to split properly. Here are
the results (from analysis.jsp) of using the standard tokenizer followed
directly by the WordDelimiterFilterFactory markup below:

1             | 2
UPPER12-lower | lower
UPPER         |
12            |

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="0" catenateWords="0" catenateNumbers="1"
catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/>
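
To make the expected split concrete, here is a minimal sketch in Python. It only approximates the kind of splitting asked for above (breaks at non-alphanumeric delimiters, at lower-to-upper case changes, and at letter/digit boundaries); it is not Lucene's actual implementation, and the function name split_subwords is hypothetical.

```python
def split_subwords(token):
    """Approximate word-delimiter splitting (illustrative assumption, not Lucene code).

    Breaks at non-alphanumeric characters, at lower->upper case changes,
    and at letter<->digit boundaries.
    """
    parts = []
    current = ""
    for ch in token:
        if not ch.isalnum():
            # A delimiter such as '-' ends the current part and is dropped.
            if current:
                parts.append(current)
            current = ""
            continue
        if current:
            prev = current[-1]
            case_change = prev.islower() and ch.isupper()
            type_change = prev.isdigit() != ch.isdigit()
            if case_change or type_change:
                parts.append(current)
                current = ""
        current += ch
    if current:
        parts.append(current)
    return parts


print(split_subwords("UPPER12-lower"))  # ['UPPER', '12', 'lower']
```

With a split like this, both "UPPER" and "lower" would exist as separate terms, which is what the queries above need in order to match.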
