lucene-solr-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: Keeping capitalization in suggestions?
Date Thu, 04 Dec 2014 13:04:39 GMT
Have a look at AnalyzingInfixSuggester - it does what you want.
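
For reference, here is a minimal sketch of how this might be wired up in solrconfig.xml. It assumes Solr 4.7 or later (where solr.SuggestComponent is available) and that the "suggest" field is stored, since the suggester returns the stored text rather than the analyzed tokens. The suggester name "infixSuggester" is made up for illustration; the field and field type names are taken from the config quoted below.

```xml
<!-- Sketch only: assumes Solr 4.7+ and a stored "suggest" field. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <!-- "infixSuggester" is a hypothetical name chosen for this example -->
    <str name="name">infixSuggester</str>
    <!-- Solr factory that wraps Lucene's AnalyzingInfixSuggester -->
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <!-- Reads suggestion text from the stored field values -->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">suggest</str>
    <!-- Analyzer used for matching only; keep LowerCaseFilter (1) in it -->
    <str name="suggestAnalyzerFieldType">text_suggest</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">infixSuggester</str>
    <str name="suggest.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With something along these lines, the lowercasing analyzer is applied to the lookup text only, so both "chamä" and "Chamä" should match, while the suggestion comes back as the original stored text ("Chamäleon").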

-Mike

On 12/4/14 3:05 AM, Clemens Wyss DEV wrote:
> When I index a text such as "Chamäleon" and look for suggestions for "chamä" and/or "Chamä", I'd expect to get "Chamäleon" (capitalized).
> But what happens is:
>
> If LowerCaseFilter (see (1) below) is set:
> "chamä" returns "chamäleon"
> "Chamä" does not match
>
> If LowerCaseFilter (1) is not set:
> "Chamä" returns "Chamäleon"
> "chamä" does not match
>
> I guess LowerCaseFilter should not be set/active, but then how do I get matches even if the search term is lowercased?
>
> Context:
> schema.xml
> ...
>      <fieldType class="solr.TextField" name="text_de" positionIncrementGap="100">
>        <analyzer type="index">
>          <tokenizer class="solr.StandardTokenizerFactory"/>
>          <filter class="solr.LowerCaseFilterFactory"/>
>          <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt"/>
>          <filter class="solr.GermanLightStemFilterFactory"/>
>        </analyzer>
>        <analyzer type="query">
>          <tokenizer class="solr.StandardTokenizerFactory"/>
>          <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
>          <filter class="solr.LowerCaseFilterFactory"/>
>          <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt"/>
>          <filter class="solr.GermanLightStemFilterFactory"/>
>        </analyzer>
>      </fieldType>
> ...
>      <fieldType class="solr.TextField" name="text_suggest" positionIncrementGap="100">
>        <analyzer>
>          <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
>          <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>          <filter class="solr.LowerCaseFilterFactory"/> <!-- (1) -->
>        </analyzer>
>      </fieldType>
>
> solrconfig.xml
> -----------------
> ...
>      <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
>          <lst name="defaults">
>              <str name="echoParams">none</str>
>              <str name="wt">json</str>
>              <str name="indent">false</str>
>              <str name="spellcheck">true</str>
>              <str name="spellcheck.dictionary">suggestDictionary</str>
>              <str name="spellcheck.onlyMorePopular">true</str>
>              <str name="spellcheck.count">5</str>
>              <str name="spellcheck.collate">false</str>
>          </lst>
>          <arr name="components">
>              <str>suggest</str>
>          </arr>
>      </requestHandler>
> ...
>      <searchComponent class="solr.SpellCheckComponent" name="suggest">
>          <lst name="spellchecker">
>              <str name="name">suggestDictionary</str>
>              <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
>              <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
>              <str name="field">suggest</str>
>              <float name="threshold">0.</float>
>              <str name="buildOnCommit">true</str>
>          </lst>
>      </searchComponent>
> ...
>

