lucene-dev mailing list archives

From Doug Cutting
Subject Re: sloppyFreq - why on Similarity?
Date Wed, 24 Sep 2003 18:31:26 GMT
Erik Hatcher wrote:
> WildcardQuery and FuzzyQuery do have the capability of affecting the 
> scoring, although only FuzzyQuery seems to take advantage of this.  
> There is no way on a Similarity implementation to affect the factors 
> applied by these queries though.  So there is some inconsistency on 
> these types of things.

I think this is historical.  Most of the other query classes are ones 
I've implemented, and I've added relevant methods to Similarity for 
them.  WildcardQuery and FuzzyQuery were contributed.  I've never used 
them in an application, as I think they're potential performance 
pitfalls, so I've probably ignored them when maintaining Similarity.  It 
is also a judgement call as to when something should be specified 
per-query (e.g., boost, phrase slop, etc.) and when it is a policy to be 
set for all queries (IDF computation, document length normalization, etc.).

> So now we need a getFuzzyFreq and getWildcardFreq?!  :)

I think you're being sarcastic, but, to be consistent, yes, if we 
believe these have parameters that are more about ranking policy than 
are query-specific.

The idea is to centrally locate the ranking policy.  I guess you could 
alternatively make these all methods on the various query classes.  But 
if, e.g., idf() were a method on TermQuery, it would make construction 
of generic query parsers more difficult.
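
To make the split concrete, here is a minimal sketch of the idea in Java. 
It is not Lucene's actual code: RankingPolicy, DefaultPolicy, FlatPolicy 
and SimpleTermQuery are hypothetical names, and the formulas are chosen 
only for illustration.  The point is that boost lives on the query, while 
idf() and sloppyFreq() live in one policy object that any query (or any 
query parser that builds queries) can be handed without knowing which 
policy is in force.

```java
// Hypothetical sketch only; these are NOT Lucene's classes or signatures.

// Central ranking policy: settings meant to apply to all queries.
abstract class RankingPolicy {
    // Weight of a term from how many of numDocs documents contain it.
    abstract float idf(int docFreq, int numDocs);

    // Contribution of a sloppy phrase match at the given edit distance.
    abstract float sloppyFreq(int distance);
}

// Illustrative default: rarer terms weigh more, sloppier matches weigh less.
class DefaultPolicy extends RankingPolicy {
    @Override
    float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    @Override
    float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }
}

// Alternative policy that ignores term rarity and slop entirely.
class FlatPolicy extends RankingPolicy {
    @Override
    float idf(int docFreq, int numDocs) {
        return 1.0f;
    }

    @Override
    float sloppyFreq(int distance) {
        return 1.0f;
    }
}

// A query carries only per-query settings (here: the term and its boost).
class SimpleTermQuery {
    final String term;
    float boost = 1.0f; // per-query parameter

    SimpleTermQuery(String term) {
        this.term = term;
    }

    // The policy is passed in, so the query never hard-codes idf().
    float weight(RankingPolicy policy, int docFreq, int numDocs) {
        return boost * policy.idf(docFreq, numDocs);
    }
}

class RankingPolicyDemo {
    public static void main(String[] args) {
        SimpleTermQuery q = new SimpleTermQuery("lucene");
        q.boost = 2.0f;
        // Same query object, two different ranking policies.
        System.out.println(q.weight(new DefaultPolicy(), 5, 100));
        System.out.println(q.weight(new FlatPolicy(), 5, 100));
    }
}
```

Swapping FlatPolicy for DefaultPolicy changes the ranking of every query 
without touching the query classes or whatever code constructs them.  If 
idf() were instead a method on the query, changing it would mean 
subclassing the query and teaching each parser to build the subclass.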

Do you have another design to propose?

