lucene-solr-user mailing list archives

From PeterKerk <vettepa...@hotmail.com>
Subject Displaying actual field values and searching lowercase ignoring spaces
Date Mon, 09 Dec 2013 17:43:38 GMT
A value of the field [street] in my DB might be "Castle Road".

However, I want to be able to find such values with a lowercase query in
which spaces are replaced by dashes, so "castle-road" would be a match.

When I use fieldtype "text_lower_space", which holds a
solr.WhitespaceTokenizerFactory, the value is split into two tokens, "Castle"
and "Road".

When I use type "string" (class "solr.StrField"), I cannot search in
lowercase and still find values that contain uppercase characters, such as
"Castle Road".

I need to be able to find values (regardless of their casing) using a
lowercase query.

I will be using the [street] field to display facets, so the text displayed
to the user should be the exact value, including casing, from field [street].
When I search on the field, however, "castle-road" should return a match.

original value     found on
Castle Road        castle-road
Oak-tree lane      oak-tree-lane


The problem is that I don't know which tokenizer to use for the index
analyzer and for the query analyzer.


    <fieldType name="text_lower_space" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="0"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
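
For comparison, here is a variation I am considering but have not tested, so treat it as a rough sketch rather than a working answer. The idea is to keep [street] as a plain string for facet display and copy it into a second field that is indexed as a single lowercase token with whitespace turned into dashes. The names "street_search" and "text_lower_dash" are made up for this example, and I am assuming that solr.PatternReplaceFilterFactory with replace="all" behaves the way I expect here.

```xml
<!-- Hypothetical field type: the whole value stays one token, is lowercased,
     and every run of whitespace becomes a single dash.
     Intended mapping: "Castle Road" to "castle-road",
     "Oak-tree lane" to "oak-tree-lane". -->
<fieldType name="text_lower_dash" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement="-" replace="all"/>
  </analyzer>
</fieldType>

<!-- [street] keeps the exact original value for facets and display. -->
<field name="street" type="string" indexed="true" stored="true"/>

<!-- Hypothetical search-only copy that uses the type above. -->
<field name="street_search" type="text_lower_dash" indexed="true" stored="false"/>
<copyField source="street" dest="street_search"/>
```

With something like this I would facet on [street] and send queries such as street_search:castle-road, but I do not know whether this is the recommended approach.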




--
View this message in context: http://lucene.472066.n3.nabble.com/Displaying-actual-field-values-and-searching-lowercase-ignoring-spaces-tp4105723.html
Sent from the Solr - User mailing list archive at Nabble.com.
