lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From moraleslos <morales...@hotmail.com>
Subject Re: Lucene for name matching
Date Thu, 05 Apr 2007 20:33:34 GMT

Hi Grant!

Thanks for the reply.  I'll look into the links you suggested.  Just curious
though, what did you do to implement this--if you can spill some of the
beans  ;-)  You think what you did was better than the FuzzyQuery approach? 
Was it a custom algorithm or did you utilize some framework for this?  I
basically don't want to reinvent the wheel when doing this name matching
issue.  Thanks in advance!

-los



Grant Ingersoll-6 wrote:
> 
> It's like deja vu all over again.  I literally just finished up a  
> similar task (about 2 hours ago).  I didn't use Lucene for it,  
> although I suppose I could have.  Lucene does have the FuzzyQuery  
> (http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/ 
> javadoc/org/apache/lucene/search/FuzzyQuery.html) that uses  
> Levenshtein as a place to start.
> 
> There are other string matching algorithms as well that are used in  
> various approaches.  See http://en.wikipedia.org/wiki/Edit_distance.   
> Googling record linkage may help.  From there, you can pretty much  
> knock yourself out with all the different approaches
> 
> On Apr 5, 2007, at 3:58 PM, moraleslos wrote:
> 
>>
>> I was wondering if anyone has done people name matching using  
>> Lucene.  For
>> example, I have a name coming from some external source that I  
>> would like to
>> match with the one I have in my DB.  Lets say my DB contains the  
>> name "John
>> Smith".  If the external source has something like "Smith John",  
>> "Smith,
>> John", "J. Smith", etc., I would like to rate this matching based  
>> on some %
>> of closeness for review later.  I've searched around a bit for  
>> algorithms
>> and I kept seeing the Levenshtein distance algorithm which I'm sure  
>> Lucene
>> uses under the hood.  So I trying to guage if Lucene is useful for  
>> doing
>> something specific as this, or are there better algorithms and/or  
>> software
>> out there that does name matching.  Thanks in advance!
>>
>> -los
>> -- 
>> View this message in context: http://www.nabble.com/Lucene-for-name- 
>> matching-tf3533454.html#a9862342
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> --------------------------
> Grant Ingersoll
> Center for Natural Language Processing
> http://www.cnlp.org
> 
> Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/ 
> LuceneFAQ
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Lucene-for-name-matching-tf3533454.html#a9863587
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message