lucene-solr-user mailing list archives

From Dan Davis <dansm...@gmail.com>
Subject Re: Spellchecker delivers far too few suggestions
Date Wed, 17 Dec 2014 16:19:00 GMT
What about the frequency comparison? I haven't used the spellchecker
heavily, but it seems that if "bnak" is in the database but "bank" is much
more frequent, then "bank" should be suggested anyway.
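
To make that concrete, here is a rough sketch of request handler defaults
that ask the spellchecker to offer suggestions even for terms that already
exist in the index. The parameter names are standard SpellCheckComponent
request parameters; the values (5 and 5) are arbitrary placeholders that I
have not tested against this index.

```xml
<!-- Sketch only: additions to the "defaults" of the standardWithSpell handler. -->
<lst name="defaults">
  <str name="spellcheck">true</str>
  <str name="spellcheck.count">10</str>
  <!-- Return up to this many suggestions for query terms that ARE in the index
       (placeholder value). -->
  <str name="spellcheck.alternativeTermCount">5</str>
  <!-- Only suggest when the original query returned at most this many hits
       (placeholder value). -->
  <str name="spellcheck.maxResultsForSuggest">5</str>
  <!-- Include document frequencies for the original term and each suggestion,
       which helps when debugging. -->
  <str name="spellcheck.extendedResults">true</str>
</lst>
```

With extendedResults on, the response should show how often "bnak" and
"bank" each occur, so the frequency comparison can be checked directly.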

On Wed, Dec 17, 2014 at 10:41 AM, Erick Erickson <erickerickson@gmail.com>
wrote:
>
> First, I'd look in your corpus for "bnak". The problem with index-based
> suggestions is that if your index contains garbage, they're "correctly
> spelled" since they're in the index. TermsComponent is very useful for
> this.
>
> You can also loosen up the match criteria, and as I remember the collations
> parameter does some permutations of the word (but my memory of how that
> works is shaky).
>
> Best,
> Erick
>
> On Wed, Dec 17, 2014 at 9:13 AM, Martin Dietze <mdietze@gmail.com> wrote:
> > I recently upgraded to SOLR 4.10.1 and after that set up the spell
> > checker which I use for returning suggestions after searches with few
> > or no results.
> > When the spellchecker is active, this request handler is used (most of
> > which is taken from examples I found in the net):
> >
> >   <requestHandler name="standardWithSpell" class="solr.SearchHandler"
> > default="false">
> >      <lst name="defaults">
> >        <str name="echoParams">explicit</str>
> >        <str name="spellcheck">true</str>
> >        <str name="spellcheck.onlyMorePopular">false</str>
> >        <str name="spellcheck.count">10</str>
> >        <str name="spellcheck.collate">false</str>
> >        <str name="q.alt">*:*</str>
> >        <str name="echoParams">explicit</str>
> >        <int name="rows">50</int>
> >        <str name="fl">*,score</str>
> >      </lst>
> >      <arr name="last-components">
> >        <str>spellcheck</str>
> >      </arr>
> >   </requestHandler>
> >
> > The search component is configured as follows (again most of it copied
> > from examples in the net):
> >
> >   <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
> >     <str name="queryAnalyzerFieldType">text</str>
> >     <lst name="spellchecker">
> >       <str name="name">default</str>
> >       <str name="field">text</str>
> >       <str name="classname">solr.DirectSolrSpellChecker</str>
> >       <str name="distanceMeasure">internal</str>
> >       <float name="accuracy">0.3</float>
> >       <int name="maxEdits">2</int>
> >       <int name="minPrefix">1</int>
> >       <int name="maxInspections">5</int>
> >       <int name="minQueryLength">4</int>
> >       <float name="maxQueryFrequency">0.01</float>
> >       <float name="maxQueryFrequency">.01</float>
> >     </lst>
> >   </searchComponent>
> >
> > With this setup I can get suggestions for misspelled words. The
> > results on my developer machine were mostly fine, but on the test
> > system (much larger database, much larger search index) I found it
> > very hard to get suggestions at all. If for instance I misspell “bank”
> > as “bnak” I’d expect to get a suggestion for “bank” (since that word
> > can be found in the index very often).
> >
> > I’ve played around with maxQueryFrequency and maxQueryFrequency with
> > no success.
> >
> > Does anyone see any obvious misconfiguration? Anything that I could try?
> >
> > Any way I can debug this? (The problem is that my application uses the
> > core API, so trying out requests through the web interface does not
> > work.)
> >
> > Any help would be greatly appreciated!
> >
> > Cheers,
> >
> > Martin
> >
> >
> > --
> > ---------- MDietze@gmail.com --/-- martin@the-little-red-haired-girl.org ----
> > ------------- / http://herbert.the-little-red-haired-girl.org / -------------
>
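
For Erick's suggestion of checking the corpus for "bnak", a TermsComponent
setup along these lines would show whether the term is indexed and how
often. It is modelled on the stock example solrconfig.xml. The terms.fl
default of "text" is my assumption, taken from the spellchecker field in
the config above.

```xml
<!-- Sketch only: exposes indexed terms and their document frequencies. -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <!-- Assumed field: the same one the spellchecker reads from. -->
    <str name="terms.fl">text</str>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```

A request with terms.prefix=bnak (and, say, terms.limit=10) against that
handler should list the matching indexed terms with their document
frequencies. The same for terms.prefix=bank gives the number to compare
against.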
