lucene-dev mailing list archives

From karl wettin <ka...@snigel.net>
Subject Contextual suggestions
Date Fri, 31 Mar 2006 04:54:59 GMT
I've been working a bit with the spell checker. It does a pretty good
job of finding a simple typo.
I was thinking it would be nice if I could turn "heros light and
magic" into "did you mean: heroes of might and magic?".

My strategy is to combine a Markov chain, A* search and Levenshtein distance.
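
For reference, a minimal sketch of the Levenshtein part, written in Python for brevity rather than Lucene's Java. The function name is illustrative and not taken from the spell checker; it is the textbook dynamic-programming edit distance, not necessarily what the final implementation would use.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits that turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        previous = current
    return previous[-1]
```

With this, both "heros" vs "heroes" and "light" vs "might" are one edit apart, which is why the example query is a plausible near miss.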

Algorithm:

First I have to train the Markov chain with the token offsets from  
Lucene.
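
A rough sketch of the training step, assuming the tokens of each field have already been read out of the index in position order (how they are pulled from Lucene is left out here). The names `train_bigrams`, `transition_probability` and the `<s>` start marker are made up for illustration; a first-order chain is assumed.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for "beginning of a token sequence"

def train_bigrams(documents):
    """Count how often each token is directly followed by another token."""
    follows = defaultdict(Counter)
    for tokens in documents:
        # Pair every token with its predecessor; the first token follows START.
        for prev, nxt in zip([START] + tokens, tokens):
            follows[prev][nxt] += 1
    return follows

def transition_probability(follows, prev, nxt):
    """Estimate P(nxt | prev) from the raw counts."""
    counts = follows.get(prev, Counter())
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0

titles = [
    ["heroes", "of", "might", "and", "magic"],
    ["might", "and", "magic"],
    ["heroes", "of", "the", "storm"],
]
chain = train_bigrams(titles)
```

Here `chain["heroes"]` only ever leads to "of", while "of" is followed by "might" or "the" with equal probability.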

At query time I choose the cheapest A* path through the Markov chain,
keeping the Levenshtein distance to the query as short as possible.

I choose A* over breadth-first search to allow zero cost for stop words
and future contextual boosting.
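
To make the query-time idea concrete, here is a small sketch under several assumptions of mine: the chain is a plain dict of successor tokens, the path cost is the summed edit distance per query token, a stop word from the chain may be inserted for free, and transition probabilities are ignored (they could be added as an extra cost term). `suggest`, `STOP_WORDS`, `max_edits` and the toy chain are hypothetical.

```python
import heapq
from itertools import count

STOP_WORDS = {"of", "and", "the"}  # assumed stop list

def levenshtein(a, b):
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]

def suggest(query, follows, max_edits=2):
    """A* over states (query tokens consumed, last emitted token)."""
    vocab = {t for successors in follows.values() for t in successors}
    if not vocab:
        return None
    # Admissible heuristic: every remaining query token costs at least
    # its distance to the closest token in the vocabulary.
    floor = [min(levenshtein(q, t) for t in vocab) for q in query]
    remaining = [sum(floor[i:]) for i in range(len(query) + 1)]

    tie = count()  # keeps heap entries comparable when estimates are equal
    frontier = [(remaining[0], next(tie), 0, 0, "<s>", [])]
    seen = set()
    while frontier:
        _, _, cost, i, prev, path = heapq.heappop(frontier)
        if i == len(query):
            return path, cost
        if (i, prev) in seen:
            continue
        seen.add((i, prev))
        for nxt in follows.get(prev, ()):
            step = levenshtein(query[i], nxt)
            if step <= max_edits:
                # Emit nxt as the correction of query[i].
                g = cost + step
                heapq.heappush(frontier, (g + remaining[i + 1], next(tie),
                                          g, i + 1, nxt, path + [nxt]))
            if nxt in STOP_WORDS:
                # Insert a stop word at zero cost without consuming input.
                heapq.heappush(frontier, (cost + remaining[i], next(tie),
                                          cost, i, nxt, path + [nxt]))
    return None

toy_chain = {
    "<s>": ["heroes", "might"],
    "heroes": ["of"],
    "of": ["might", "the"],
    "might": ["and"],
    "and": ["magic"],
    "the": ["storm"],
}
result = suggest("heros light and magic".split(), toy_chain)
```

On this toy chain the search returns heroes of might and magic with a total cost of 2 (one edit each for "heros" and "light"); the missing "of" comes in through the free stop-word step, which a plain breadth-first search with uniform step costs would not express.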

Any comments on this? Questions?
