lucene-java-user mailing list archives

From Grant Ingersoll <>
Subject Re: Lucene for name matching
Date Thu, 05 Apr 2007 20:17:08 GMT
It's like deja vu all over again.  I literally just finished up a
similar task (about 2 hours ago).  I didn't use Lucene for it,
although I suppose I could have.  As a place to start, Lucene does have
FuzzyQuery (javadoc/org/apache/lucene/search/FuzzyQuery.html), which is
based on Levenshtein edit distance.
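
To make the "% of closeness" idea concrete, here is a minimal plain-Java sketch of Levenshtein distance turned into a 0..1 score. It is not Lucene's implementation or API; the class name LevenshteinSketch, the method names, and the normalization by the longer string's length are all assumptions made for illustration.

```java
final class LevenshteinSketch {

    // Edit distance where insert, delete and substitute each cost 1,
    // computed with two rolling rows of the usual DP table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical normalization: 1.0 means identical, 0.0 means nothing in common.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) distance(a, b) / longest;
    }
}
```

For example, LevenshteinSketch.distance("smith", "smyth") is 1, so the similarity is 0.8.  Note that a whole-string edit distance knows nothing about word order, so on its own it is a poor fit for reordered names like "Smith John".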

There are other string matching algorithms as well that are used in
various approaches.  Googling "record linkage" may help.  From there,
you can pretty much knock yourself out with all the different approaches.
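
As one example of a different approach, the sketch below compares names token by token instead of as whole strings, so word order and punctuation stop mattering and a bare initial gets partial credit.  This is only an illustration under stated assumptions, not a standard record-linkage algorithm: the class NameMatchSketch, the greedy pairing, and the 0.5 weight for an initial are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NameMatchSketch {

    // Lowercase, split on anything that is not a letter, drop empty tokens.
    static List<String> tokens(String name) {
        List<String> out = new ArrayList<>();
        for (String t : name.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // 1.0 for equal tokens; 0.5 (arbitrary illustrative weight) when one
    // token is a single-letter initial of the other; otherwise 0.0.
    static double tokenScore(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        boolean initial = (a.length() == 1 || b.length() == 1)
                && a.charAt(0) == b.charAt(0);
        return initial ? 0.5 : 0.0;
    }

    // Greedily pair each token of x with its best unused token of y,
    // ignoring order, and average over the larger token count.
    static double score(String x, String y) {
        List<String> a = tokens(x);
        List<String> b = tokens(y);
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        boolean[] used = new boolean[b.size()];
        double total = 0.0;
        for (String ta : a) {
            int best = -1;
            double bestScore = 0.0;
            for (int j = 0; j < b.size(); j++) {
                if (used[j]) {
                    continue;
                }
                double s = tokenScore(ta, b.get(j));
                if (s > bestScore) {
                    bestScore = s;
                    best = j;
                }
            }
            if (best >= 0) {
                used[best] = true;
                total += bestScore;
            }
        }
        return total / Math.max(a.size(), b.size());
    }
}
```

With these weights, "John Smith" scores 1.0 against both "Smith John" and "Smith, John", and 0.75 against "J. Smith".  Swapping the exact-match test in tokenScore for an edit-distance-based score would also tolerate misspellings.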

On Apr 5, 2007, at 3:58 PM, moraleslos wrote:

> I was wondering if anyone has done people name matching using Lucene.
> For example, I have a name coming from some external source that I
> would like to match with the one I have in my DB.  Let's say my DB
> contains the name "John Smith".  If the external source has something
> like "Smith John", "Smith, John", "J. Smith", etc., I would like to
> rate this matching based on some % of closeness for review later.
> I've searched around a bit for algorithms and I kept seeing the
> Levenshtein distance algorithm, which I'm sure Lucene uses under the
> hood.  So I'm trying to gauge if Lucene is useful for doing something
> as specific as this, or are there better algorithms and/or software
> out there that do name matching.  Thanks in advance!
> -los

Grant Ingersoll
Center for Natural Language Processing
