lucene-java-user mailing list archives

From RaviK Thakur <>
Subject Using Lucene for Moderate Similarity Check..
Date Tue, 09 Jun 2009 08:16:47 GMT

Hello All,
      I want to check the feasibility of using Lucene for a similarity check
between two flat CSV files. The actual requirement is this: we have two
files, each containing customer information such as name, address, pin
code, etc. Some customers may appear in both files, and we want to find
the customers that are common to both. The match should be made attribute
by attribute: if a customer's name in one file matches a customer's name in
the other file, then compare the address; if that matches, compare the pin
code, and so on. The main consideration is that this matching is not
exact. If the customer names match to, say, 80%, that may be treated as a
match. For example, if ABDUL is compared with ABDULLAH, it should be
treated as a match. In this fashion, each record of one file will be
compared with each record of the other file. The output of this procedure
will be another file containing the matched records.
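
To make the requirement concrete, here is a minimal plain-Java sketch of the
attribute-by-attribute comparison described above. It does not use Lucene at
all. The class name FuzzyRecordMatch, the row layout (name, address, pin code
as a String[]) and the similarity measure (1 minus Levenshtein distance
divided by the longer length) are all assumptions made for illustration. Note
that under this particular measure ABDUL vs ABDULLAH scores 0.625, so an 80%
threshold would reject it; a lower threshold or a prefix-oriented measure
would be needed for that pair to count as a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: plain Java, not Lucene's API.
final class FuzzyRecordMatch {

    /** Levenshtein edit distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed measure: 1 - distance / longer length, case-insensitive. */
    static double similarity(String a, String b) {
        String x = a.trim().toUpperCase(Locale.ROOT);
        String y = b.trim().toUpperCase(Locale.ROOT);
        int longer = Math.max(x.length(), y.length());
        if (longer == 0) {
            return 1.0; // two empty fields are treated as identical
        }
        return 1.0 - (double) levenshtein(x, y) / longer;
    }

    /** Compares fields in order and stops at the first one below threshold. */
    static boolean matches(String[] a, String[] b, double threshold) {
        int fields = Math.min(a.length, b.length);
        for (int k = 0; k < fields; k++) {
            if (similarity(a[k], b[k]) < threshold) {
                return false;
            }
        }
        return fields > 0;
    }

    /** Brute force: every row of left against every row of right. */
    static List<int[]> matchAll(List<String[]> left, List<String[]> right,
                                double threshold) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < left.size(); i++) {
            for (int j = 0; j < right.size(); j++) {
                if (matches(left.get(i), right.get(j), threshold)) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(similarity("ABDUL", "ABDULLAH")); // prints 0.625
    }
}
```

The brute-force loop performs (rows in file 1) x (rows in file 2)
comparisons, which is the part an index would be expected to cut down by
supplying only a few candidate rows for each record.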

Can anyone please comment on how applicable Lucene is to this requirement,
perhaps in the form of pros and cons?

Thanks in advance :-)

