lucene-solr-user mailing list archives

From Frank Wesemann <f.wesem...@fotofinder.com>
Subject SuggestComponent in distributed (SolrCloud) environment
Date Thu, 09 Oct 2014 19:29:29 GMT
Hi,
I'm about to integrate the SuggestComponent into our application and noticed
some behavior I didn't expect. My Solr version is 4.9.

1. Terms that occur on more than one shard are returned once per shard,
   i.e. up to n times for n shards.
2. Because of the way the suggestions from each shard are collected, the
   "exactMatchFirst" parameter on the LookupImpl is practically ignored.
3. At least the Jaspell lookup returns terms from deleted documents.

Is this expected behavior or am I missing something?
My config is quite "defaulty":
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="name">fst_mit_threshold</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <float name="threshold">0.0000007</float>
    <str name="storeDir">suggestions/</str>
    <str name="exactMatchFirst">true</str>
    <str name="field">suggest_context</str>
    <str name="suggestAnalyzerFieldType">suggestContextAnalyzer</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">default</str>
    <str name="suggest.dictionary">fst_mit_threshold</str>
    <str name="suggest.count">20</str>
    <str name="shards.qt">/suggest</str>
    <str name="wt">json</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

After a very brief look at the sources, I think the first two issues
should be resolvable by plugging another Queue implementation into
SuggestComponent's finishStage().
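
To illustrate the idea for issues 1 and 2, here is a rough, standalone sketch
of a merge step that collapses duplicate terms and puts an exact match first.
It is not Solr code: ShardSuggestionMerger, Suggestion and merge() are
hypothetical names made up for this example, and keeping the highest weight
per term is only an assumption (weights from different shards may not be
directly comparable).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Solr. */
final class ShardSuggestionMerger {

    /** One suggested term with its weight, as a single shard might report it. */
    static final class Suggestion {
        final String term;
        final long weight;

        Suggestion(String term, long weight) {
            this.term = term;
            this.weight = weight;
        }
    }

    /**
     * Merges per-shard suggestion lists into at most {@code count} distinct terms.
     * A term equal to {@code query} is ranked first; the rest are ordered by
     * descending weight, with ties broken alphabetically.
     */
    static List<String> merge(List<List<Suggestion>> perShard, String query, int count) {
        // Issue 1: keep one entry per term (assumption: highest weight wins).
        Map<String, Long> best = new LinkedHashMap<>();
        for (List<Suggestion> shard : perShard) {
            for (Suggestion s : shard) {
                best.merge(s.term, s.weight, Long::max);
            }
        }

        // Issue 2: re-apply "exact match first" after merging, then sort by weight.
        List<Map.Entry<String, Long>> entries = new ArrayList<>(best.entrySet());
        entries.sort((a, b) -> {
            boolean aExact = a.getKey().equals(query);
            boolean bExact = b.getKey().equals(query);
            if (aExact != bExact) {
                return aExact ? -1 : 1;
            }
            int byWeight = Long.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });

        // Cut the merged list down to the requested number of suggestions.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries) {
            if (result.size() >= count) {
                break;
            }
            result.add(e.getKey());
        }
        return result;
    }
}
```

With two shards that both report "foto", the merged list contains "foto" only
once, and for the query "foto" it comes first even if other terms have a
higher weight.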

I am quite unsure about no. 3. After all, these are only suggestions, so
nobody guarantees that the suggested terms yield results, but it feels a
little strange from the user's point of view.

Any thoughts on this?
If anybody is interested, I can open an issue in JIRA and work on 1 and 2.


-- 
Kind regards,

Frank Wesemann
Fotofinder GmbH         USt-IdNr. DE812854514
Software Entwicklung    Web: http://www.fotofinder.com/
Potsdamer Str. 96       Tel: +49 30 25 79 28 90
10785 Berlin            Fax: +49 30 25 79 28 999

Sitz: Berlin
Amtsgericht Berlin Charlottenburg (HRB 73099)
Geschäftsführer: Ali Paczensky
