lucene-dev mailing list archives

From Berkes Adam
Subject FuzzyLikeThis query and exact matches
Date Thu, 27 Aug 2009 12:22:33 GMT

In our Java project we use a (slightly modified) version of the
FuzzyLikeThis query, which is documented as follows:

"For each source term the fuzzy variants are held in a BooleanQuery with
 no coord factor (because we are not looking for matches on multiple
 variants in any one doc). Additionally, a specialized TermQuery is used
 for variants and does not use that variant term's IDF because this would
 favour rarer terms eg misspellings. Instead, all variants use the same
 IDF ranking (the one for the source query term) and this is factored
 into the variant's boost. If the source query term does not exist in the
 index the average IDF of the variants is used."
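
To make the quoted IDF rule concrete, here is a minimal Java sketch. It is
not the actual FuzzyLikeThis code: the class and method names are
hypothetical, and it assumes the classic Lucene DefaultSimilarity formula
idf = 1 + ln(numDocs / (docFreq + 1)).

```java
import java.util.List;

// Hypothetical illustration of the "shared IDF" rule; not Lucene code.
class SharedIdfSketch {

    // Assumed formula (classic DefaultSimilarity): 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // All variants of one source term share a single IDF value:
    // the source term's own IDF if it occurs in the index,
    // otherwise the average IDF of the variants.
    static double sharedIdf(int sourceDocFreq, List<Integer> variantDocFreqs, int numDocs) {
        if (sourceDocFreq > 0) {
            return idf(sourceDocFreq, numDocs);
        }
        if (variantDocFreqs.isEmpty()) {
            return 0.0; // nothing to average over
        }
        double sum = 0.0;
        for (int docFreq : variantDocFreqs) {
            sum += idf(docFreq, numDocs);
        }
        return sum / variantDocFreqs.size();
    }
}
```

Because every variant gets the same IDF, a rare misspelling is not scored
above the common, correctly spelled term just for being rare.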

In most cases it performs well, but when a short query term has (as is
usual) a large number of variants, the exact matches end up scattered
among the variant matches, which is not very useful. The results should
be ordered so that exact matches come first (or are forcibly made more
relevant), followed by the variant matches ranked by relevancy.
Is there a simple solution, or an already implemented contrib query
class, for this problem?
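
The ordering described above can be sketched in plain Java as follows.
This only illustrates the desired result order, as a post-processing
step over already scored hits. The Hit type and the exactFirst method are
hypothetical and do not correspond to any Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the desired ordering; not Lucene code.
class ExactFirstSketch {

    // A scored hit together with the index term that produced the match.
    static final class Hit {
        final String matchedTerm;
        final float score;

        Hit(String matchedTerm, float score) {
            this.matchedTerm = matchedTerm;
            this.score = score;
        }
    }

    // Returns a copy of the hits with exact matches first,
    // each group ordered by descending score.
    static List<Hit> exactFirst(List<Hit> hits, String queryTerm) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> {
            boolean aExact = a.matchedTerm.equals(queryTerm);
            boolean bExact = b.matchedTerm.equals(queryTerm);
            if (aExact != bExact) {
                return aExact ? -1 : 1; // exact matches sort ahead of variants
            }
            return Float.compare(b.score, a.score); // higher score first
        });
        return sorted;
    }
}
```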

Best regards,
Adam Berkes,
Intland Software
