lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: How to get infix match suggestions in solr
Date Tue, 06 Sep 2016 09:44:12 GMT
Hi Pradeep,
there are generally two possible approaches:

1) create a separate Solr instance dedicated to auto-suggestion, making the
service as elaborate as we need
2) use the Solr suggest component, which comes out of the box with
support for many different approaches.

I suggest reading these two blog posts:

http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
https://lucidworks.com/blog/2015/03/04/solr-suggester/

To answer your question in detail, I can quote part of my blog:

AnalyzingInfixLookupFactory

<lst name="suggester">
  <str name="name">AnalyzingInfixSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="weightField">price</str>
  <str name="suggestAnalyzerFieldType">text_en</str>
</lst>
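
For context, this `<lst name="suggester">` block lives inside a `solr.SuggestComponent` search component in solrconfig.xml, and `text_en` must be a field type defined in the schema. Once a request handler exposes that component, the suggester can be queried over HTTP. The following is only a minimal sketch of building such a request: the host, the core name `mycore` and the handler path `/suggest` are assumptions for illustration, while `suggest`, `suggest.dictionary`, `suggest.q` and `suggest.build` are the usual SuggestComponent request parameters.

```python
from urllib.parse import urlencode


def suggest_url(base_url, query, dictionary="AnalyzingInfixSuggester", build=False):
    """Build a suggest request URL for a handler that runs the suggest component.

    base_url is hypothetical: it depends on your host, core and handler name.
    dictionary must match the <str name="name"> value of the suggester.
    """
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": query,
        "wt": "json",
    }
    if build:
        # Ask Solr to (re)build the auxiliary index before answering.
        params["suggest.build"] = "true"
    return base_url + "?" + urlencode(params)


# Assumed location of the handler; adjust to your own installation.
url = suggest_url("http://localhost:8983/solr/mycore/suggest", "game econ")
print(url)
```

Sending `suggest.build=true` on the first request (or after reindexing) is typically what populates the auxiliary index; leaving it out on later requests avoids rebuilding it every time.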


Description

*Data Structure* Auxiliary Lucene index.
*Building* For each document, the *stored content* of the field is
*analyzed* according to the *suggestAnalyzerFieldType* and then additionally
*EdgeNgram token filtered*.
Finally an auxiliary index is built with those tokens.
*Lookup strategy* The query is analysed according to the
*suggestAnalyzerFieldType*.
Then a phrase search is triggered against the *auxiliary Lucene index*.
The suggestions are identified starting at the *beginning of each token* in
the field content.
*Suggestions returned* The *entire content* of the field.
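
To make the data structure concrete, here is a rough sketch of an edge n-gram index in plain Python. It is not how Lucene implements the auxiliary index; the function names and the two sample documents are made up for illustration, and the "analysis" is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def edge_ngrams(token):
    """Return every leading prefix of a token: 'game' -> g, ga, gam, game."""
    return [token[:i] for i in range(1, len(token) + 1)]


def build_index(docs):
    """Map each edge n-gram to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, content in enumerate(docs):
        # Toy analysis: lowercase and split on whitespace only.
        for token in content.lower().split():
            for gram in edge_ngrams(token):
                index[gram].add(doc_id)
    return index


# Hypothetical field contents.
docs = ["Video game history", "Board games"]
index = build_index(docs)

# A lookup for "ga" hits any document with a token starting with "ga",
# and the whole stored content is returned as the suggestion.
suggestions = [docs[i] for i in sorted(index.get("ga", set()))]
print(suggestions)
```

Because every token contributes its prefixes, a match can start at any token in the field, not only at the first one, which is what gives the infix behaviour.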

This suggester is really common nowadays, as it can provide
suggestions from the middle of a field's content, taking advantage of the
analysis chain configured for the field.
In this way it is possible to provide suggestions that take into account
*synonyms*, *stop words*, *stemming* and any other token filter used in the
analysis, and to match the suggestion based on *internal tokens*.

Let's see some examples. Each one gives the query to autocomplete, the
suggestions and the explanation.

*"gaming"*

   - "Video *gami*ng: the history"
   - "Video *game*s are an economic business"
   - "Video *game*: multiplayer gaming"

The input query is analysed, and the token produced is: *"game"*.
In the auxiliary index, for each field content we have the
EdgeNgram tokens:
"v","vi","vid"… , "g","ga","gam",*"game"*.
So the match happens and the suggestions are returned.

*"ga"*

   - "Video *ga*ming: the history"
   - "Video *ga*mes are an economic business"
   - "Video *ga*me: multiplayer gaming"

The input query is analysed, and the token produced is: *"ga"*.
In the auxiliary index, for each field content we have the
EdgeNgram tokens:
"v","vi","vid"… , "g",*"ga"*,"gam","game".
So the match happens and the suggestions are returned.

*"game econ"*

   - "Video *games* are an *econ*omic business"

Stop words will not appear in the auxiliary index.
Both "game" and "econ" will, so the match applies.
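
The three examples above can be reproduced with a small toy model. This is only a sketch under loud assumptions: the stop word list and the `STEMS` lookup table are hand-written stand-ins for the real `text_en` analysis chain, and every query token is matched as a prefix, which simplifies what the actual suggester does.

```python
import re

STOP_WORDS = {"a", "an", "are", "the"}        # toy stop word list (assumption)
STEMS = {"gaming": "game", "games": "game"}   # toy stand-in for a real stemmer


def analyze(text):
    """Lowercase, split on non-letters, drop stop words, apply the toy stemmer."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [STEMS.get(t, t) for t in tokens if t not in STOP_WORDS]


def prefixes(token):
    """All edge n-grams of a token."""
    return {token[:i] for i in range(1, len(token) + 1)}


def suggest(query, titles):
    """Return the titles where every analyzed query token is an edge n-gram."""
    wanted = analyze(query)
    hits = []
    for title in titles:
        grams = set()
        for token in analyze(title):
            grams |= prefixes(token)
        if wanted and all(q in grams for q in wanted):
            hits.append(title)  # the entire field content is the suggestion
    return hits


TITLES = [
    "Video gaming: the history",
    "Video games are an economic business",
    "Video game: multiplayer gaming",
]

for query in ("gaming", "ga", "game econ"):
    print(query, "->", suggest(query, TITLES))
```

With these toy rules, "gaming" and "ga" match all three titles, while "game econ" matches only the second one, since "are" and "an" are dropped as stop words and "econ" is a prefix of "economic".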

Cheers

On Tue, Sep 6, 2016 at 5:34 AM, Pradeep Chandra <
pradeepchandra.551@gmail.com> wrote:

> Hi,
>
> Solr suggester is giving prefix suggestions in default. How to get infix
> matched suggestions? I am using AnylyzingInfixSuggestFactory. I don't know
> how to configure the Schema.xml & Solrconfig.xml. Can anyone help me.
>
> Thanks and Regards
> M Pradeep Chandra
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
