mahout-user mailing list archives

From prasenjit mukherjee <>
Subject Approximate String matching
Date Thu, 25 Jun 2009 04:19:06 GMT
    Please accept my apologies if this is not the correct
forum. I am trying to find a solution for approximate string matching: I
need to find all strings in a corpus that differ from a given pattern by
at most "d" edit operations. The allowed operations are insertion,
deletion, and substitution. I am not interested in transposition, as it
could be very expensive.
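
To make the problem statement concrete, here is a minimal brute-force sketch in Java. It computes the plain Levenshtein distance (insertion, deletion, substitution; no transposition) between the pattern and every corpus string and keeps those within "d". The class and method names (ApproxMatch, editDistance, matches) are made up for illustration and do not come from Mahout or LingPipe.

```java
import java.util.ArrayList;
import java.util.List;

public class ApproxMatch {

    /** Levenshtein distance: insert, delete, substitute (no transposition). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns every corpus string within d edit operations of the pattern. */
    static List<String> matches(List<String> corpus, String pattern, int d) {
        List<String> out = new ArrayList<>();
        for (String s : corpus) {
            // A length difference larger than d can never be repaired in d edits.
            if (Math.abs(s.length() - pattern.length()) > d) {
                continue;
            }
            if (editDistance(s, pattern) <= d) {
                out.add(s);
            }
        }
        return out;
    }
}
```

This costs one full dynamic-programming table per corpus string, so it is only a baseline to compare other approaches against.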

I looked into LingPipe; it has a trie-based solution in a class called
Aproximate*Chunker*. Does anybody have a better approach?
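
For reference, one common way to do a trie-based search of this kind is sketched below in Java. This is not LingPipe's implementation, only an illustration of the general idea: store the corpus in a trie, walk it depth-first while carrying one row of the edit-distance table per node, and prune any branch whose row minimum already exceeds "d". All names here (TrieApproxSearch, Node, insert, search) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TrieApproxSearch {

    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String word; // non-null if a corpus string ends at this node
    }

    private final Node root = new Node();

    void insert(String s) {
        Node n = root;
        for (char c : s.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
        }
        n.word = s;
    }

    /** Returns all inserted strings within d insert/delete/substitute edits. */
    List<String> search(String pattern, int d) {
        // Row for the empty prefix: distance to pattern[0..j) is j.
        int[] row = new int[pattern.length() + 1];
        for (int j = 0; j < row.length; j++) {
            row[j] = j;
        }
        List<String> out = new ArrayList<>();
        if (root.word != null && row[pattern.length()] <= d) {
            out.add(root.word);
        }
        for (Map.Entry<Character, Node> e : root.children.entrySet()) {
            walk(e.getValue(), e.getKey(), pattern, row, d, out);
        }
        return out;
    }

    private void walk(Node node, char c, String pattern, int[] prev,
                      int d, List<String> out) {
        int[] curr = new int[prev.length];
        curr[0] = prev[0] + 1;
        int min = curr[0];
        for (int j = 1; j < curr.length; j++) {
            int cost = pattern.charAt(j - 1) == c ? 0 : 1;
            int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
            curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            min = Math.min(min, curr[j]);
        }
        if (node.word != null && curr[curr.length - 1] <= d) {
            out.add(node.word);
        }
        // Row minima never decrease going deeper, so this subtree cannot match.
        if (min > d) {
            return;
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            walk(e.getValue(), e.getKey(), pattern, curr, d, out);
        }
    }
}
```

Strings that share a prefix share the table rows for that prefix, and pruning skips whole subtrees, which is where the savings over a string-by-string scan would come from. Result order is unspecified because children are kept in a HashMap.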