lucenenet-user mailing list archives

From "Allan, Brad (Bracknell)" <Brad.Al...@Fiserv.com>
Subject Getting fuzzy match information
Date Thu, 12 Dec 2013 16:22:59 GMT
Has anyone done, or does anyone know of, work that would help me get detailed information
about my hits with regard to fuzzy matches? I'm also very happy to receive suggestions :).

I'm looking to obtain the similarity percentage of each token in each hit.

Example: fuzzy query looks something like this:
(name:80% similar to "john" or name:80% similar to "henry" or name:80% similar to "smith")
And I get hits:

* Jon George Smythe
* John Joe Henry
* Smith John & Carter engineering

These are all valid hits. However, my users want to be able to view the similarity, and
indeed to prioritise certain actions by comparing the results of 2 different searches
(normalised scores are therefore not as useful as the actual similarity information).

Clearly this sort of ability does not make sense when one is searching in large amounts of
data (documents), but in my case I'm searching through a set of names and some additional
person information.

One option could be to post-process the hits and use/lift the FuzzyTermEnum logic to
re-compute the similarity value. Or perhaps extend FuzzyQuery to register a 'listener' that
receives the information?
Other ideas? Thoughts?
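
To make the post-processing option concrete, here is a minimal sketch in plain Java with no
Lucene dependency. It assumes the similarity is defined as 1 - editDistance / min(length of
query term, length of token), which is my understanding of what the classic FuzzyTermEnum
computes when no prefix length is set; please check that against the version in use. The
class name, method names and the tokenisation rule (lower-case, split on anything that is
not a letter or digit) are hypothetical and would need to match the analyzer used at index
time.

```java
import java.util.Locale;

public class FuzzySimilarity {

    // Plain Levenshtein edit distance, using two rolling rows.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed formula: 1 - distance / length of the shorter string.
    // Negative values are clamped to 0 so the result can be shown as a percentage.
    public static double similarity(String queryTerm, String token) {
        int shorter = Math.min(queryTerm.length(), token.length());
        if (shorter == 0) {
            return queryTerm.length() == token.length() ? 1.0 : 0.0;
        }
        double s = 1.0 - (double) editDistance(queryTerm, token) / shorter;
        return Math.max(0.0, s);
    }

    // Highest similarity between the query term and any token of the hit text.
    // Tokenisation here is an assumption: lower-case, split on non-alphanumerics.
    public static double bestSimilarity(String queryTerm, String hitText) {
        String term = queryTerm.toLowerCase(Locale.ROOT);
        double best = 0.0;
        for (String token : hitText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                best = Math.max(best, similarity(term, token));
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] queryTerms = {"john", "henry", "smith"};
        String[] hits = {"Jon George Smythe", "John Joe Henry", "Smith John & Carter engineering"};
        for (String hit : hits) {
            for (String term : queryTerms) {
                System.out.printf(Locale.ROOT, "%s | %s | %.0f%%%n",
                        hit, term, 100 * bestSimilarity(term, hit));
            }
        }
    }
}
```

Because the value is recomputed per query term and per hit, it does not depend on score
normalisation, so figures from 2 different searches could be compared directly.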



________________________________

CheckFree Solutions Limited (trading as Fiserv)
Registered Office: Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
Registered in England: No. 2694333
