lucene-dev mailing list archives

From Walter Underwood
Subject Re: Any support for DoubleMetaphone ever putting out secondary tokens?
Date Sun, 28 Apr 2013 03:22:08 GMT
Double Metaphone is a good idea, but not that useful in practice. Searchers just don't type
full phonetic versions of their queries. Nobody types "ratatooie"; they type "rata" and then
stop rather than risk a misspelling.

So, not that important.


On Apr 27, 2013, at 5:57 PM, Mark Bennett wrote:

> As I understand Wikipedia, Double Metaphone improves over Metaphone in 2 areas:
> 1: Better linguistic matching
> 2: Can output a secondary token for words like Schmidt
> From a quick look at the Apache Commons Codec and the Lucene filter, it doesn't seem like that secondary token is supported. There is "save" code for whether inject is true or false, but that's not the same thing, and it doesn't seem to have been extended.
> Either I'm reading it wrong, or it somehow produces a compound token in those cases?
> Looking on the web, one author claims that only 10% of names need a second token anyway, so it's not a big deal, but still good to know.
> Thanks
> --
> Mark Bennett / New Idea Engineering, Inc. /
> Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
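
For what it's worth, here is a rough sketch of what "putting out secondary tokens" could look like, as opposed to the inject flag. As far as I know, Commons Codec's DoubleMetaphone class does expose the alternate code through doubleMetaphone(value, true), so the open question is mostly whether the filter emits it. The sketch is plain Python rather than Lucene code, and everything in it is an assumption for illustration: the phonetic_tokens function, the (term, position increment) pairs, and the toy_encode lookup table, whose codes are written by hand and not captured from any library.

```python
from typing import Callable, List, Optional, Tuple

# (term, position_increment); an increment of 0 stacks a token on the previous one.
Token = Tuple[str, int]
# Hypothetical encoder shape: word -> (primary code, secondary code or None).
Encoder = Callable[[str], Tuple[str, Optional[str]]]


def phonetic_tokens(words: List[str], encode: Encoder, inject: bool) -> List[Token]:
    """Emit primary and (when different) secondary codes for each word.

    inject=True keeps the original word alongside its codes;
    inject=False replaces the word with its codes.
    """
    out: List[Token] = []
    for word in words:
        primary, secondary = encode(word)
        codes = [primary]
        # Only emit the secondary code when it adds something new.
        if secondary and secondary != primary:
            codes.append(secondary)
        terms = ([word] if inject else []) + codes
        # The first term advances the position; the rest share it.
        for i, term in enumerate(terms):
            out.append((term, 1 if i == 0 else 0))
    return out


def toy_encode(word: str) -> Tuple[str, Optional[str]]:
    # Hand-written illustrative codes, NOT output from a real encoder.
    table = {"schmidt": ("XMT", "SMT"), "smith": ("SM0", "XMT")}
    return table[word.lower()]


if __name__ == "__main__":
    print(phonetic_tokens(["Schmidt"], toy_encode, inject=False))
    print(phonetic_tokens(["Schmidt"], toy_encode, inject=True))
```

With the secondary code stacked at the same position, a query for one spelling can match an index entry whose primary code differs, which is the whole point of the second token. The inject flag is orthogonal: it only decides whether the original word survives next to the codes.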

