lucene-java-user mailing list archives

From Caroline Collet <caroline.col...@pertimm.com>
Subject Re: Lucene DirectSpellChecker strange behavior
Date Tue, 07 Jun 2016 15:28:53 GMT
Thank you for your prompt reply; this makes perfect sense.

On 07/06/2016 17:24, Robert Muir wrote:
> It's just a heuristic: it does not allow 2 edits 
> (insertion/deletion/substitution/transposition) to the word if the 
> first character differs 
> (https://github.com/apache/lucene-solr/blob/master/lucene/suggest/src/java/org/apache/lucene/search/spell/DirectSpellChecker.java#L411).

> So when it goes back for n=2, it requires the first character to match.
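
The two-pass rule described above can be modelled with a small sketch in plain Java. It does not use any Lucene classes: the class and method names (PrefixHeuristicSketch, editDistance, suggests) are hypothetical, and the sketch ignores everything else the real spellchecker takes into account (accuracy threshold, document frequency, inspection limits). It only assumes what the reply states: a first pass with n=1 that honours the configured prefix length, then a pass with n=2 that requires the first character to match.

```java
final class PrefixHeuristicSketch {

    /** Edit distance counting insertion, deletion, substitution and adjacent transposition. */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // Adjacent transposition, e.g. "ab" <-> "ba", counts as one edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** True if both strings start with the same `len` characters. */
    static boolean sharesPrefix(String a, String b, int len) {
        return a.length() >= len && b.length() >= len && a.regionMatches(0, b, 0, len);
    }

    /** One pass: the candidate must share the required prefix and be within `edits` edits. */
    static boolean matchesPass(String query, String candidate, int edits, int requiredPrefix) {
        return sharesPrefix(query, candidate, requiredPrefix)
                && editDistance(query, candidate) <= edits;
    }

    /** Assumed model of the heuristic: n=1 with the configured prefix, then n=2 with prefix >= 1. */
    static boolean suggests(String query, String candidate, int minPrefix, int maxEdits) {
        if (matchesPass(query, candidate, 1, minPrefix)) {
            return true;
        }
        return maxEdits > 1 && matchesPass(query, candidate, 2, Math.max(minPrefix, 1));
    }

    public static void main(String[] args) {
        for (String query : new String[] {"smsng", "amsung", "amung"}) {
            boolean hit = suggests(query, "samsung", 0, 2);
            System.out.println(query + " -> " + (hit ? "samsung" : "(nothing)"));
        }
    }
}
```

Under these assumptions, "smsng" (2 edits, same first letter) and "amsung" (1 edit) both reach "samsung", while "amung" (2 edits, different first letter) returns nothing, which matches the behavior reported in the original question.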
>
> At least at the time the thing was written, this had a very large 
> impact on performance, because otherwise too much of the term 
> dictionary would have to be inspected and it's much slower. The idea is, it 
> won't hurt too much on quality, for the same reasons that many of 
> these string distance functions incorporate a bias towards the 
> matching prefix (e.g. Jaro-Winkler).
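
To illustrate the prefix bias mentioned above, here is a minimal sketch of the Jaro and Jaro-Winkler similarities in plain Java. The class name PrefixBiasSketch and the example misspellings are hypothetical. The sketch assumes the commonly used Winkler scaling factor of 0.1 and a prefix cap of 4 characters, and omits the boost threshold that some implementations apply. It is not Lucene's implementation.

```java
final class PrefixBiasSketch {

    /** Jaro similarity in [0, 1]; 1 means identical strings. */
    static double jaro(String s1, String s2) {
        if (s1.isEmpty() && s2.isEmpty()) {
            return 1.0;
        }
        // Characters count as matching only if they are at most `window` positions apart.
        int window = Math.max(0, Math.max(s1.length(), s2.length()) / 2 - 1);
        boolean[] matched1 = new boolean[s1.length()];
        boolean[] matched2 = new boolean[s2.length()];
        int matches = 0;
        for (int i = 0; i < s1.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(s2.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order in the two strings.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (!matched1[i]) {
                continue;
            }
            while (!matched2[k]) {
                k++;
            }
            if (s1.charAt(i) != s2.charAt(k)) {
                outOfOrder++;
            }
            k++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / s1.length() + m / s2.length() + (m - transpositions) / m) / 3.0;
    }

    /** Jaro-Winkler: the Jaro score boosted by the length of the common prefix (capped at 4). */
    static double jaroWinkler(String s1, String s2) {
        double j = jaro(s1, s2);
        int limit = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < limit && s1.charAt(prefix) == s2.charAt(prefix)) {
            prefix++;
        }
        return j + prefix * 0.1 * (1.0 - j);
    }

    public static void main(String[] args) {
        // Both misspellings are one substitution away from "samsung".
        for (String query : new String[] {"samsumg", "tamsung"}) {
            System.out.printf("%s: jaro=%.4f jaroWinkler=%.4f%n",
                    query, jaro(query, "samsung"), jaroWinkler(query, "samsung"));
        }
    }
}
```

Both example misspellings get the same plain Jaro score, but only the one whose first letters are intact receives the Winkler boost. This is the same intuition as the spellchecker heuristic: an error in the first character is treated as less likely than an error later in the word.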
>
>
> On Tue, Jun 7, 2016 at 5:20 AM, Caroline Collet 
> <caroline.collet@pertimm.com> wrote:
>
>     Hello,
>
>     I am seeing very strange behavior when I use Lucene's
>     DirectSpellChecker. I have set the prefixLength to 0 and indexed
>     only one item with one field: brand=samsung.
>     I then tried queries containing spelling mistakes.
>
>     When I search for "smsng" I obtain "samsung", which is logical
>     since only 2 corrections are needed to obtain "samsung".
>     When I search for "amsung" I obtain "samsung", since I have set the
>     prefixLength to 0.
>     But when I search for "amung", which also has only 2 errors, I do
>     not obtain "samsung"; I obtain nothing.
>
>     I don't understand this behavior; it is as if no other correction
>     is permitted when the first letter is misspelled.
>
>     Did I miss some parameters of the spellchecker that could explain
>     this behavior?
>
>     For reference, I am using:
>     - Lucene 5.5.0
>     - JRE 1.8
>
>     Thank you in advance for taking the time to answer my question,
>     Best regards,
>     -- 
>     PERTIMM
>
>     Caroline Collet
>     Development Engineer
>
>     Tel: +33 (0)1 80 04 82 89
>     caroline.collet@pertimm.com
>     http://www.pertimm.com/fr/
>
>     Pertimm
>     51, boulevard Voltaire
>     92600 Asnières-Sur-Seine, France