lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Spencer <dave-lucene-u...@tropo.com>
Subject Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene
Date Wed, 15 Sep 2004 19:54:54 GMT
Aad Nales wrote:

> By trying: if you type const you will find that it returns 216 hits. The
> third sports 'const' as a term (space seperated and all). I would expect
> 'conts' to return with const as well. But again I might be mistaken. I
> am now trying to figure what the problem might be: 
> 
> 1. my expectations (most likely ;-)
> 2. something in the code..



I enhanced the code to store simple transpositions also and I 
regenerated my site w/ ngrams from 2 to 5 chars. If you set the 
transposition boost up to 10 then "const" is returned 2nd...

http://www.searchmorph.com/kat/spell.jsp?s=conts&min=2&max=5&maxd=5&maxr=10&bstart=2.0&bend=1.0&btranspose=10.0&popular=1

> 
> -----Original Message-----
> From: Andrzej Bialecki [mailto:ab@getopt.org] 
> Sent: Wednesday, 15 September, 2004 12:23
> To: Lucene Users List
> Subject: Re: NGramSpeller contribution -- Re: combining open office
> spellchecker with Lucene
> 
> 
> Aad Nales wrote:
> 
> 
>>David,
>>
>>Perhaps I misunderstand somehting so please correct me if I do. I used
> 
> 
>>http://www.searchmorph.com/kat/spell.jsp to look for conts without 
>>changing any of the default values. What I got as results did not 
>>include 'const' which has quite a high frequency in your index and
> 
> 
> ??? how do you know that? Remember, this is an index of _Java_docs, and 
> "const" is not a Java keyword.
> 
> 
>>should have a pretty low levenshtein distance. Any idea what causes 
>>this behavior?
> 
> 
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message