lucene-dev mailing list archives

From: Mark Harwood
Subject: Re: FuzzyLikeThis query and exact matches
Date: Thu, 27 Aug 2009 13:07:44 GMT
Despite making IDF a constant, the edit distance should remain a factor
in the rankings, so I would have thought this would give you what you

Can you supply a more detailed example? Either print the rewritten
query or use the explain function.
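
To illustrate the point about edit distance, here is a minimal sketch in
plain Java that does not use Lucene at all. It assumes every variant is
scored as similarity times one shared IDF, where the similarity is a
simplified stand-in (1 - editDistance / longerLength) and not Lucene's
actual fuzzy formula. The class name FuzzyRankSketch, its methods, and
the IDF value 2.5 are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FuzzyRankSketch {

    // Classic dynamic-programming Levenshtein distance (two rolling rows).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Simplified similarity in [0, 1]; 1.0 means an exact match.
    // ASSUMPTION: illustrative formula, not the one Lucene uses.
    static double similarity(String source, String variant) {
        int maxLen = Math.max(source.length(), variant.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) editDistance(source, variant) / maxLen;
    }

    // All variants share one IDF, so only the similarity separates them.
    static double variantScore(String source, String variant, double sharedIdf) {
        return similarity(source, variant) * sharedIdf;
    }

    // Returns a copy of the variants ordered by descending score.
    static List<String> rank(String source, List<String> variants, double sharedIdf) {
        List<String> sorted = new ArrayList<>(variants);
        sorted.sort(Comparator
                .comparingDouble((String v) -> variantScore(source, v, sharedIdf))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> variants = Arrays.asList("cat", "car", "cot", "can");
        for (String v : rank("car", variants, 2.5)) {
            System.out.println(v + " " + variantScore("car", v, 2.5));
        }
    }
}
```

Under these assumptions the exact match always has the highest per-term
score, because its similarity is 1.0 while the IDF is identical for all
variants. If exact matches still end up scattered in a real index, the
cause is probably another scoring factor, which the explain output
should reveal.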


On 27 Aug 2009, at 13:22, Berkes Adam wrote:

> Hi,
> In our Java project we use a (slightly modified) version of the
> FuzzyLikeThis query, which is described as follows:
> "For each source term the fuzzy variants are held in a BooleanQuery  
> with no coord factor (because
> we are not looking for matches on multiple variants in any one doc).  
> Additionally, a specialized
> TermQuery is used for variants and does not use that variant term's  
> IDF because this would favour rarer
> terms eg misspellings. Instead, all variants use the same IDF  
> ranking (the one for the source query
> term) and this is factored into the variant's boost. If the source  
> query term does not exist in the
> index the average IDF of the variants is used."
> In most cases it performs well, but for a short query term with
> (as usual) a large number of variants, the exact matches end up
> scattered among the other results, which is not very useful. The
> results should be "sorted" so that exact matches come first (or are
> forcibly made more relevant), followed by variant matches, each
> ordered by relevancy.
> Is there a simple solution, or an existing contrib query class, for
> this problem?
> Best regards,
> Adam Berkes,
> Intland Software
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
