lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: autosuggest with solr.EdgeNGramFilterFactory no result found
Date Fri, 03 Jul 2015 16:57:23 GMT
OK, I think you took a wrong turn at the bakery....

The FST-based suggesters are intended to look at the
beginnings of fields. It is totally unnecessary to use
ngrams; the FST that gets built does the prefix matching
_for_ you. Its internal structure handles this
"en passant".
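
As a rough sketch of what that means for the configuration quoted
below: the suggester can point at the plain, stored title field, with
no ngram filter anywhere in the chain. This assumes the stored title
field is an acceptable suggestion source and that a numeric price field
exists in the schema (it is not among the field definitions shown).

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <!-- FST-based lookup: prefix matching comes from the FST itself -->
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- Plain stored field; DocumentDictionaryFactory reads stored values,
         so a stored="false" ngram copy gives it nothing to suggest -->
    <str name="field">title</str>
    <!-- Assumption: price is a stored numeric field in the schema -->
    <str name="weightField">price</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

This only matches from the start of the field value, which is the
behaviour already described in the original question.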

For getting whole fields back when the match is anywhere
in the input field, you probably want to think about
AnalyzingInfixSuggester or FreeTextSuggester.
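
A minimal sketch of the infix variant, under assumptions: it reuses the
text_hu field type from the quoted schema as the suggest analyzer, and
the indexPath value is a made-up directory name. Treat it as a starting
point rather than a tested configuration.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <!-- Matches prefixes of any term in the field, not only the first -->
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <!-- Assumption: price is a stored numeric field in the schema -->
    <str name="weightField">price</str>
    <!-- Field type whose analyzer is applied to suggestions and queries -->
    <str name="suggestAnalyzerFieldType">text_hu</str>
    <!-- Hypothetical directory name; use a distinct one per suggester -->
    <str name="indexPath">suggest_title_infix</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

The author and publisher suggesters would follow the same pattern, each
with its own name, field and indexPath.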

The important bit here is that you shouldn't have to do
so much work...

This might help:

http://lucidworks.com/blog/solr-suggester/

Best,
Erick

On Fri, Jul 3, 2015 at 4:40 AM, Roland Szűcs
<roland.szucs@bookandwalk.com> wrote:
> I tried to set up an autosuggest feature with multiple dictionaries for
> the title, author and publisher fields.
>
> I used the solr.EdgeNGramFilterFactory to optimize the performance of the
> auto suggest.
>
> I have a document in the index with title: Romana.
>
> When I test the text analysis for autosuggest (on the field
> title_suggest_ngram):
> ENGTF
> text    raw_bytes            start  end  positionLength  type  position
> rom     [72 6f 6d]           0      6    1               word  1
> roma    [72 6f 6d 61]        0      6    1               word  1
> roman   [72 6f 6d 61 6e]     0      6    1               word  1
> romana  [72 6f 6d 61 6e 61]  0      6    1               word  1
> If I try to run http://localhost:8983/solr/bandw/suggest?q=Roma, I get:
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> </lst>
> <lst name="suggest">
> <lst name="suggest_publisher">
> <lst name="Roma">
> <int name="numFound">0</int>
> <arr name="suggestions"/>
> </lst>
> </lst>
> <lst name="suggest_title">
> <lst name="Roma">
> <int name="numFound">0</int>
> <arr name="suggestions"/>
> </lst>
> </lst>
> <lst name="suggest_author">
> <lst name="Roma">
> <int name="numFound">0</int>
> <arr name="suggestions"/>
> </lst>
> </lst>
> </lst>
> </response>
>
> my relevant field definitions:
> <field name="id" type="string" indexed="true" stored="true" required="true"
> multiValued="false" omitNorms="true" />
>    <field name="author" type="text_hu" indexed="true" stored="true"
> multiValued="true"/>
>    <field name="title" type="text_hu" indexed="true" stored="true"
> multiValued="false"/>
>    <field name="subtitle" type="text_hu" indexed="true" stored="true"
> multiValued="false"/>
>    <field name="publisher" type="text_hu" indexed="true" stored="true"
> multiValued="false"/>
> <field name="title_suggest_ngram" type="text_hu_suggest_ngram"
> indexed="true" stored="false" multiValued="false" omitNorms="true"/>
>    <field name="author_suggest_ngram" type="text_hu_suggest_ngram"
> indexed="true" stored="false" multiValued="false" omitNorms="true"/>
>    <field name="publisher_suggest_ngram" type="text_hu_suggest_ngram"
> indexed="true" stored="false" multiValued="false" omitNorms="true"/>
>    <copyField source="title" dest="title_suggest_ngram"/>
>    <copyField source="author" dest="author_suggest_ngram"/>
>    <copyField source="publisher" dest="publisher_suggest_ngram"/>
>
> My EdgeNGram related field type definition:
> <fieldType name="text_hu_suggest_ngram" class="solr.TextField"
> positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="stopwords_hu.txt"
>                 />
>         <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
> maxGramSize="8"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="stopwords_hu.txt"
>                 />
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
> </fieldType>
>
> My requesthandler for suggest:
> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
> <lst name="defaults">
> <str name="suggest">true</str>
> <str name="suggest.count">5</str>
> <str name="suggest.dictionary">suggest_author</str>
> <str name="suggest.dictionary">suggest_title</str>
> <str name="suggest.dictionary">suggest_publisher</str>
> </lst>
> <arr name="components">
> <str>suggest</str>
> </arr>
>   </requestHandler>
>
> And finally my searchcomponent:
> <searchComponent name="suggest" class="solr.SuggestComponent">
> <lst name="suggester">
> <str name="name">suggest_title</str>
> <str name="lookupImpl">FSTLookupFactory</str>
> <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> <str name="field">title_suggest_ngram</str>
> <str name="weightField">price</str>
> <str name="buildOnStartup">true</str>
> <str name="buildOnCommit">true</str>
> </lst>
> <lst name="suggester">
> <str name="name">suggest_author</str>
> <str name="lookupImpl">FSTLookupFactory</str>
> <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> <str name="field">author_suggest_ngram</str>
> <str name="weightField">price</str>
> <str name="buildOnStartup">true</str>
> <str name="buildOnCommit">true</str>
> </lst>
> <lst name="suggester">
> <str name="name">suggest_publisher</str>
> <str name="lookupImpl">FSTLookupFactory</str>
> <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> <str name="field">publisher_suggest_ngram</str>
> <str name="weightField">price</str>
> <str name="buildOnCommit">true</str>
> </lst>
>   </searchComponent>
> If I change the search component definition to use the title field instead
> of title_suggest_ngram, then I get suggest results only if my title
> field starts with the string specified in the q parameter.
> As a field-level autosuggester, I would also like to suggest matches on any
> term of the title, not just the first one.
> What should I do to use autosuggest correctly?
>
> --
> Roland Szűcs
> CEO, Bookandwalk.hu <https://bookandwalk.hu/>
> Phone: +36 1 210 81 13
> Connect with me on Linkedin <https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24>
