lucene-dev mailing list archives

From karl wettin <ka...@snigel.net>
Subject Re: Contextual suggestions
Date Mon, 03 Apr 2006 05:34:38 GMT

On 31 Mar 2006, at 06:54, karl wettin wrote:

> I've been working a bit with the spell checker. It does a pretty
> good job when it comes to finding a simple typo.
> I was thinking it would be nice if I could turn "heros light and  
> magic" to "did you mean: heroes of might and magic?".
>
> My strategy is to combine Markov, A* and Levenshtein.

> Any comments on this? Questions?

Nothing? Not even a go-go-go? I would really like to discuss it with
someone before I spend too much time on it. This is what it is: a
simple Markov chain is similar to n-grams, but on the word level rather
than the character level. A* is a classic best-first search algorithm,
popular in games, that finds the cheapest path through a graph or grid.
I assume you all know Levenshtein distance from FuzzyQuery.
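
To make the idea concrete, here is a minimal sketch of how the three pieces could fit together. It is an illustration under assumptions, not the spell checker's code: the function names (`levenshtein`, `train_bigrams`, `transition_cost`, `suggest`), the `<s>` start marker, the `unseen` penalty of 5.0 and the equal weighting of edit distance against transition cost are all made up for this example. It also only substitutes words one for one, so it cannot insert a missing word such as the "of" in the example above.

```python
import heapq
import math
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def train_bigrams(phrases):
    """Word-level Markov chain: counts[w1][w2] = times w2 followed w1."""
    counts = defaultdict(lambda: defaultdict(int))
    for phrase in phrases:
        words = ["<s>"] + phrase.lower().split()
        for w1, w2 in zip(words, words[1:]):
            counts[w1][w2] += 1
    return counts


def transition_cost(counts, w1, w2, unseen=5.0):
    """Negative log probability of w2 following w1 (assumed flat penalty if unseen)."""
    following = counts.get(w1, {})
    seen = following.get(w2, 0)
    if seen == 0:
        return unseen
    return -math.log(seen / sum(following.values()))


def suggest(counts, query, unseen=5.0):
    """A* over (position, previous word) states; returns the cheapest phrase."""
    vocab = sorted({w for nxt in counts.values() for w in nxt})
    words = query.lower().split()
    if not vocab:
        return None
    # edit[i][w] = edit distance between the i-th query word and candidate w
    edit = [{w: levenshtein(q, w) for w in vocab} for q in words]
    # Heuristic: every remaining query word costs at least its smallest edit
    # distance to the vocabulary, so h never overestimates the remaining cost.
    h = [0.0] * (len(words) + 1)
    for i in range(len(words) - 1, -1, -1):
        h[i] = h[i + 1] + min(edit[i].values())
    frontier = [(h[0], 0.0, 0, "<s>", ())]
    closed = set()
    while frontier:
        _, g, i, prev, path = heapq.heappop(frontier)
        if i == len(words):
            return " ".join(path)
        if (i, prev) in closed:
            continue
        closed.add((i, prev))
        for w in vocab:
            g2 = g + edit[i][w] + transition_cost(counts, prev, w, unseen)
            heapq.heappush(frontier, (g2 + h[i + 1], g2, i + 1, w, path + (w,)))
    return None


counts = train_bigrams(["heroes of might and magic", "light and shadow"])
print(suggest(counts, "heros of light and magic"))
```

In this toy corpus "light" is a perfectly good word, so a plain per-word spell check would leave it alone; it is the unseen "of light" transition that makes "might" the cheaper choice.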

I have been sleeping on this a bit and think it might not work on a
big corpus. One probably has to limit it to one Markov chain per
context of some kind, say per category.
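
A rough sketch of what "one Markov chain per context" could look like, assuming each document arrives tagged with a category. The names `train_per_context` and `next_word_probability`, the `<s>` start marker and the sample categories are hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict


def train_per_context(documents):
    """documents: iterable of (context, text) pairs.

    Returns {context: {w1: Counter of the words that followed w1}},
    i.e. a separate word-level chain for every context.
    """
    chains = defaultdict(lambda: defaultdict(Counter))
    for context, text in documents:
        words = ["<s>"] + text.lower().split()
        for w1, w2 in zip(words, words[1:]):
            chains[context][w1][w2] += 1
    return chains


def next_word_probability(chains, context, w1, w2):
    """Estimate P(w2 | w1) using only the chain of the given context."""
    following = chains.get(context, {}).get(w1)
    if not following:
        return 0.0
    return following[w2] / sum(following.values())


docs = [
    ("games", "heroes of might and magic"),
    ("games", "might and magic"),
    ("poetry", "light and shadow"),
]
chains = train_per_context(docs)
print(next_word_probability(chains, "games", "and", "magic"))
print(next_word_probability(chains, "poetry", "and", "magic"))
```

Keeping the chains apart means each lookup only touches the transitions of one category, and a phrase that is common in one category does not skew suggestions in another.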

Perhaps there is some other forum, more focused on text analysis,
that you would recommend?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

