lucene-solr-user mailing list archives

From Clemens Wyss DEV <clemens...@mysign.ch>
Subject Keeping capitalization in suggestions?
Date Thu, 04 Dec 2014 08:05:04 GMT
When I index a text such as "Chamäleon" and request suggestions for "chamä" and/or "Chamä",
I'd expect to get "Chamäleon" (capitalized) in both cases.
But what happens is:

If LowerCaseFilter (see (1) below) is set:
"chamä" returns "chamäleon"
"Chamä" does not match

If LowerCaseFilter (1) is not set:
"Chamä" returns "Chamäleon"
"chamä" does not match

I guess LowerCaseFilter should not be set/active, but then how do I get matches even if the
search term is lowercase?
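
To make the two cases concrete, here is a toy model in Python. It is not Solr code. It assumes the suggester does a plain prefix lookup over the analyzed terms of the field and that the search term itself is compared as typed; the function names are made up for illustration.

```python
def build_terms(texts, lowercase):
    """Collect the terms a suggester would be built from (whitespace tokens only)."""
    terms = set()
    for text in texts:
        for token in text.split():
            # With the lowercase filter on, only the lowercased form survives.
            terms.add(token.lower() if lowercase else token)
    return sorted(terms)


def suggest(terms, prefix):
    """Case-sensitive prefix lookup; the prefix is used exactly as typed."""
    return [term for term in terms if term.startswith(prefix)]


docs = ["Chamäleon"]
with_filter = build_terms(docs, lowercase=True)
without_filter = build_terms(docs, lowercase=False)

print(suggest(with_filter, "chamä"))     # ['chamäleon']
print(suggest(with_filter, "Chamä"))     # []
print(suggest(without_filter, "Chamä"))  # ['Chamäleon']
print(suggest(without_filter, "chamä"))  # []
```

In this model the original capitalization is lost as soon as the term is lowercased at index time, and without lowercasing the lookup is simply case-sensitive, which matches what I see.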

Context:
schema.xml
-----------------
...
    <fieldType class="solr.TextField" name="text_de" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt"/>
        <filter class="solr.GermanLightStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt"/>
        <filter class="solr.GermanLightStemFilterFactory"/>
      </analyzer>
    </fieldType>
...
    <fieldType class="solr.TextField" name="text_suggest" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/> <!-- (1) -->
      </analyzer>
    </fieldType>

solrconfig.xml
-----------------
...
    <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
        <lst name="defaults">
            <str name="echoParams">none</str>
            <str name="wt">json</str>
            <str name="indent">false</str>
            <str name="spellcheck">true</str>
            <str name="spellcheck.dictionary">suggestDictionary</str>
            <str name="spellcheck.onlyMorePopular">true</str>
            <str name="spellcheck.count">5</str>
            <str name="spellcheck.collate">false</str>
        </lst>
        <arr name="components">
            <str>suggest</str>
        </arr>
    </requestHandler>
...
    <searchComponent class="solr.SpellCheckComponent" name="suggest">
        <lst name="spellchecker">
            <str name="name">suggestDictionary</str>
            <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
            <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
            <str name="field">suggest</str>
            <float name="threshold">0.</float>
            <str name="buildOnCommit">true</str>
        </lst>
    </searchComponent>
...
