lucene-solr-user mailing list archives

From Roland Szűcs <roland.sz...@bookandwalk.com>
Subject autosuggest with solr.EdgeNGramFilterFactory no result found
Date Fri, 03 Jul 2015 11:40:07 GMT
I tried to set up an autosuggest feature with multiple dictionaries for the
title, author and publisher fields.

I used solr.EdgeNGramFilterFactory to optimize the performance of the
autosuggest.

I have a document in the index with title: Romana.

When I test the text analysis for autosuggest (on the field
title_suggest_ngram), the ENGTF stage shows:

text    raw_bytes            start  end  positionLength  type  position
rom     [72 6f 6d]           0      6    1               word  1
roma    [72 6f 6d 61]        0      6    1               word  1
roman   [72 6f 6d 61 6e]     0      6    1               word  1
romana  [72 6f 6d 61 6e 61]  0      6    1               word  1

If I try to run http://localhost:8983/solr/bandw/suggest?q=Roma, I get:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="suggest">
<lst name="suggest_publisher">
<lst name="Roma">
<int name="numFound">0</int>
<arr name="suggestions"/>
</lst>
</lst>
<lst name="suggest_title">
<lst name="Roma">
<int name="numFound">0</int>
<arr name="suggestions"/>
</lst>
</lst>
<lst name="suggest_author">
<lst name="Roma">
<int name="numFound">0</int>
<arr name="suggestions"/>
</lst>
</lst>
</lst>
</response>

My relevant field definitions:
<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false" omitNorms="true" />
   <field name="author" type="text_hu" indexed="true" stored="true"
multiValued="true"/>
   <field name="title" type="text_hu" indexed="true" stored="true"
multiValued="false"/>
   <field name="subtitle" type="text_hu" indexed="true" stored="true"
multiValued="false"/>
   <field name="publisher" type="text_hu" indexed="true" stored="true"
multiValued="false"/>
<field name="title_suggest_ngram" type="text_hu_suggest_ngram"
indexed="true" stored="false" multiValued="false" omitNorms="true"/>
   <field name="author_suggest_ngram" type="text_hu_suggest_ngram"
indexed="true" stored="false" multiValued="false" omitNorms="true"/>
   <field name="publisher_suggest_ngram" type="text_hu_suggest_ngram"
indexed="true" stored="false" multiValued="false" omitNorms="true"/>
   <copyField source="title" dest="title_suggest_ngram"/>
   <copyField source="author" dest="author_suggest_ngram"/>
   <copyField source="publisher" dest="publisher_suggest_ngram"/>

My EdgeNGram related field type definition:
<fieldType name="text_hu_suggest_ngram" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords_hu.txt"
                />
        <filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
maxGramSize="8"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords_hu.txt"
                />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
</fieldType>

My request handler for suggest:
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">5</str>
<str name="suggest.dictionary">suggest_author</str>
<str name="suggest.dictionary">suggest_title</str>
<str name="suggest.dictionary">suggest_publisher</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
  </requestHandler>

And finally my search component:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">suggest_title</str>
<str name="lookupImpl">FSTLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">title_suggest_ngram</str>
<str name="weightField">price</str>
<str name="buildOnStartup">true</str>
<str name="buildOnCommit">true</str>
</lst>
<lst name="suggester">
<str name="name">suggest_author</str>
<str name="lookupImpl">FSTLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">author_suggest_ngram</str>
<str name="weightField">price</str>
<str name="buildOnStartup">true</str>
<str name="buildOnCommit">true</str>
</lst>
<lst name="suggester">
<str name="name">suggest_publisher</str>
<str name="lookupImpl">FSTLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">publisher_suggest_ngram</str>
<str name="weightField">price</str>
<str name="buildOnCommit">true</str>
</lst>
  </searchComponent>

If I change the search component definition to use the title field instead of
title_suggest_ngram, then I do get suggest results, but only if my title
field starts with the string specified in the q parameter.
As a field-level autosuggester, I would also like to suggest those matches where
the query matches any term of the title, not only the first one.
What should I do to use autosuggest correctly?
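
One possible direction, given as an untested sketch rather than a confirmed
fix: DocumentDictionaryFactory builds its suggestions from stored field
values, and title_suggest_ngram is declared with stored="false", which may
explain the empty results. FSTLookupFactory also matches only from the start
of the whole suggestion. The fragment below assumes that switching to
AnalyzingInfixLookupFactory on the stored title field is acceptable, that
text_hu is a suitable analyzer for suggestions (a type without stemming may
behave more predictably), that price is a stored numeric field, and that
suggest_title_infix is a free directory name chosen here for illustration.

```xml
<!-- Sketch only: the analyzer type, indexPath value and price field are assumptions -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <!-- Infix lookup matches the typed prefix against any token of a suggestion, not only the first -->
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- The dictionary reads stored values, so it points at the stored title field -->
    <str name="field">title</str>
    <str name="weightField">price</str>
    <!-- Analyzer used for both the suggestions and the typed query; no edge n-grams are needed -->
    <str name="suggestAnalyzerFieldType">text_hu</str>
    <!-- Each infix suggester needs its own index directory (hypothetical name) -->
    <str name="indexPath">suggest_title_infix</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

The author and publisher suggesters would follow the same pattern, each with
its own name, field and indexPath. With this setup the *_suggest_ngram copy
fields would no longer be used by the suggester.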

-- 
Roland Szűcs
CEO, Bookandwalk.hu <https://bookandwalk.hu/>
Phone: +36 1 210 81 13
Connect with me on LinkedIn <https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24>
