lucene-solr-user mailing list archives

From "Dyer, James" <James.D...@ingramcontent.com>
Subject RE: WordBreakSolrSpellChecker Usage
Date Thu, 11 Dec 2014 18:19:59 GMT
Matt,

There is no exact number here, but I would think most people would want "count" to be around
10-20.  Increasing it incurs a very small performance penalty for each term it generates
suggestions for, but you probably won't notice a difference.  For "maxCollationTries", 5 is
a reasonable number, but you might see improved collations if this is also around 10.  With
this one you get a much larger performance penalty, but only when it needs to try more combinations
to return the "maxCollations".  In your case you have "maxCollations" at 5 also, right?  I would
reduce this to the maximum number of rewritten queries your application or users are actually going
to use.  In a lot of cases, 1 is the right number here.  This would improve performance for
you in some cases.
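
As a rough sketch of the values discussed above, the spellcheck defaults on a request handler
might look something like the following.  The numbers are starting points rather than
recommendations for every index, "spellcheck.collate" is assumed to be enabled already, and
the rest of the handler definition is omitted.

```xml
<!-- Sketch only: spellcheck parameters inside a request handler's "defaults" list -->
<lst name="defaults">
  <str name="spellcheck">true</str>
  <str name="spellcheck.collate">true</str>
  <!-- Suggestions generated per term; 10-20 costs very little -->
  <str name="spellcheck.count">10</str>
  <!-- Costlier: how many combinations are tested before giving up -->
  <str name="spellcheck.maxCollationTries">10</str>
  <!-- Only as many rewritten queries as the application will actually use -->
  <str name="spellcheck.maxCollations">1</str>
</lst>
```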

Possibly “Rock point” > “Rockpoint” is failing because you have "maxChanges" set to 10.
This tells it you are willing for it to break a word into 10 separate parts, or
to combine up to 10 adjacent words into 1.  Having taken a quick glance at the code, I think
what is happening is that it is trying things like "r ock p oint" and "r o ck p o int", etc.,
and never getting to your intended result.  In a typical scenario I would set "maxChanges" to
1-3, and often 1 is probably the most appropriate value here.
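
For illustration, a word-break spellchecker entry with a low "maxChanges" might look roughly
like this.  The field name "name" and the checker name "wordbreak" are placeholders, not taken
from your gist, so substitute whatever your search component already uses.

```xml
<!-- Sketch only: one "spellchecker" entry inside the SpellCheckComponent definition -->
<lst name="spellchecker">
  <!-- Placeholder name; keep the name your handler's spellcheck.dictionary refers to -->
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <!-- Placeholder field; use the field that holds terms like "Rockpoint" -->
  <str name="field">name</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <!-- Keep this small (1-3) so simple splits and joins are tried -->
  <int name="maxChanges">1</int>
</lst>
```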

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Matt Mongeau [mailto:halogenandtoast@gmail.com] 
Sent: Thursday, December 11, 2014 11:34 AM
To: solr-user@lucene.apache.org
Subject: Re: WordBreakSolrSpellChecker Usage

Is there a suggested value for these? I bumped them up to 20 and still
nothing seems to have changed.

On Thu, Dec 11, 2014 at 9:42 AM, Dyer, James <James.Dyer@ingramcontent.com>
wrote:

> My first guess, seeing that it works some of the time but not others,
> is that these values are too low:
>
> <str name="spellcheck.maxCollationTries">5</str>
> <str name="spellcheck.count">5</str>
>
> You know spellcheck.count is too low if the suggestion you want is not in
> the "suggestions" part of the response, but increasing it makes it get
> included.
>
> You know that spellcheck.maxCollationTries is too low if it exists in
> "suggestions" but it is not getting suggested in the "collation" section.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: Matt Mongeau [mailto:halogenandtoast@gmail.com]
> Sent: Wednesday, December 10, 2014 12:43 PM
> To: solr-user@lucene.apache.org
> Subject: Fwd: WordBreakSolrSpellChecker Usage
>
> If I have my search component set up like this
> https://gist.github.com/halogenandtoast/cf9f296d01527080f18c and I have an
> entry for “Rockpoint”, shouldn’t “Rock point” generate suggestions?
>
> This doesn't seem to be the case, but it works for "Blackstone" with "Black
> stone". Any ideas on what I might be doing wrong?
>