lucene-java-user mailing list archives

From eks dev <eks...@yahoo.co.uk>
Subject Re: question with spellchecker
Date Wed, 07 Jun 2006 06:00:55 GMT
Try a query like ((ducted^1000 duct~2) +tape),
or maybe (duct* +tape).
Even better, you could try some stemming (the Porter stemmer should get rid of these -ed suffixes),
possibly combined with one of the queries above.

If this does not help, have a look at the LingPipe spellChecker class, as it looks like exactly
what you need.
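
For a rough feel of how far apart these words are, here is a minimal sketch of plain Levenshtein edit distance in Java. It is not Lucene's fuzzy-query or SpellChecker code; the class and method names are made up for illustration.

```java
class EditDistanceSketch {
    // Classic Levenshtein distance (insert, delete, substitute each cost 1),
    // computed with two rolling rows instead of a full table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("ducted", "duct"));   // prints 2
        System.out.println(distance("ducted", "dotted")); // prints 2
    }
}
```

By raw edit distance, "duct" and "dotted" are equally far from "ducted", which is one reason stemming the -ed suffix away before matching may be the more reliable route.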

----- Original Message ----
From: Van Nguyen <vnguyen@wynnesystems.com>
To: java-user@lucene.apache.org
Sent: Wednesday, 7 June, 2006 2:49:52 AM
Subject: question with spellchecker

I'm implementing a spellchecker in my search and have a question.

After creating the index and the spellchecker index, I pass in the phrase
"ducted tape" to search (I am expecting "duct tape" back).

I've played around with boosting the prefixes and suffixes, setting the
accuracy, passing in an IndexReader and a field to search on, and setting
'morePopular' to true, but my search never returns "duct tape".

From the SpellChecker class, I see that for the word "ducted", it builds
the following query:

start3:duc^2.0 end3:ted gram3:duc gram3:uct gram3:cte gram3:ted
start4:duct^2.0 end4:cted gram4:duct gram4:ucte gram4:cted
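
The shape of that query can be reproduced with a small sketch. This is not the SpellChecker source, only an illustration that assumes the clauses are a start gram, an end gram, and every sliding n-gram for each gram size; the class and method names are hypothetical, and the ^2.0 boosts are left out.

```java
import java.util.ArrayList;
import java.util.List;

class GramSketch {
    // Build "field:value" clauses for one word and one gram size n:
    // the leading n characters, the trailing n characters, and every
    // sliding window of n characters in between.
    static List<String> clauses(String word, int n) {
        List<String> out = new ArrayList<>();
        if (word.length() < n) {
            return out; // word too short for this gram size
        }
        int last = word.length() - n;
        out.add("start" + n + ":" + word.substring(0, n));
        out.add("end" + n + ":" + word.substring(last));
        for (int i = 0; i <= last; i++) {
            out.add("gram" + n + ":" + word.substring(i, i + n));
        }
        return out;
    }

    public static void main(String[] args) {
        // Gram sizes 3 and 4 are taken from the query shown above.
        System.out.println(String.join(" ", clauses("ducted", 3)));
        System.out.println(String.join(" ", clauses("ducted", 4)));
    }
}
```

A match is only possible if the indexed word was stored under the same gramN fields. As far as I recall, the SpellChecker picks gram sizes from the word length, so a four-letter word such as "duct" may be indexed under shorter grams than the 3- and 4-grams queried here. That is an assumption worth checking against the source of the version in use, but it would explain why "duct" never appears among the suggestions.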

 

I checked whether the word "duct" even exists in the spellchecker index,
and it does. I asked for more similar words than the above-mentioned query
returns, so that I could see every term the spellchecker is suggesting, but
"duct" is not even among the suggestions.

The list it returns is:

[dotted, coated, ductile, plated, vented, mounted, united, listed,
ductape, reduced]

Does anyone have suggestions as to how to proceed from here?

Van
