lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: theoretical maximum score
Date Fri, 16 May 2008 22:04:31 GMT

: Is it possible to compute a theoretical maximum score for a given query if
: constraints are placed on 'tf' and 'lengthNorm'? If so, scores could be
: compared to a 'perfect score' (a feature request from our customers)

without thinking about it too hard, you'd also need to constrain:
 * field and document boosts (which are combined with lengthNorm to create 
   the fieldNorm)
 * query time boosts
 * idf (there's no law that says Similarity.idf has to return a number 
   less than 1) 
 * queryNorm

you'd also need to use query structures that are simple -- a 
ValueSourceQuery can break all the rules it wants -- but if you used only 
basic types of queries (boolean, term, phrase) and you imposed those 
constraints you could probably pull it off.
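A minimal sketch of what such a bound could look like, in plain Java rather than the Lucene API. It assumes the classic tf-idf style formula (coord * queryNorm * sum of tf * idf^2 * boost * fieldNorm), a flat OR of term clauses, and positive idf and boost values that are known in advance. The class and method names (MaxScoreSketch, score, maxScore) and the caps tfMax and normMax are made up for illustration.

```java
import java.util.Arrays;

class MaxScoreSketch {

    // queryNorm = 1 / sqrt(sum of squared term weights), with weight = idf * boost.
    static double queryNorm(double[] idf, double[] boost) {
        double sumSq = 0.0;
        for (int i = 0; i < idf.length; i++) {
            double w = idf[i] * boost[i];
            sumSq += w * w;
        }
        return 1.0 / Math.sqrt(sumSq);
    }

    // Score of one document against the query; tf[i] == 0 means term i did not match.
    static double score(double[] idf, double[] boost, double[] tf, double[] fieldNorm) {
        int matched = 0;
        double sum = 0.0;
        for (int i = 0; i < idf.length; i++) {
            if (tf[i] > 0) {
                matched++;
            }
            sum += tf[i] * idf[i] * idf[i] * boost[i] * fieldNorm[i];
        }
        double coord = (double) matched / idf.length;
        return coord * queryNorm(idf, boost) * sum;
    }

    // Upper bound under the assumed caps: every term matches, with tf and
    // fieldNorm both at their maximum allowed values.
    static double maxScore(double[] idf, double[] boost, double tfMax, double normMax) {
        double[] tf = new double[idf.length];
        double[] norm = new double[idf.length];
        Arrays.fill(tf, tfMax);
        Arrays.fill(norm, normMax);
        return score(idf, boost, tf, norm);
    }

    public static void main(String[] args) {
        double[] idf = {2.0, 1.0};    // assumed fixed per-term idf values
        double[] boost = {1.0, 1.0};  // assumed query-time boosts
        double max = maxScore(idf, boost, 1.0, 1.0);
        // A document that matches only the first term, at the tf cap.
        double actual = score(idf, boost, new double[] {1.0, 0.0}, new double[] {1.0, 1.0});
        System.out.println("score " + actual + " out of a max possible " + max);
    }
}
```

The bound only holds while every factor is non-negative and capped; an unbounded idf, boost, or a query type that computes its own score breaks it.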

the key thing to watch out for is that even if you can do it, and you can 
start to say meaningful things like "doc A scored X out of a max possible Y 
against query Q" that doesn't necessarily help you compare that with doc 
B which scored V out of a max possible W against query R ... even if X/Y == V/W 
... because the *structure* and complexity of Q and R play havoc 
with the scores.
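A toy illustration of that trap, again in plain Java and not the Lucene API. It assumes the same classic-style formula with idf, boost and fieldNorm all fixed at 1 and tf capped at 1.0; the class name RatioComparisonSketch and the sample tf values are invented for the example. Doc A is a weak match for a one-term query, doc B is a perfect match on one of two terms, and both land on the same fraction of their maximum.

```java
class RatioComparisonSketch {

    // Classic-style score for a query whose terms all have idf = 1 and boost = 1,
    // with fieldNorm fixed at 1. tf[i] == 0 means term i is absent from the doc.
    static double score(double[] tf) {
        int matched = 0;
        double sum = 0.0;
        for (double f : tf) {
            if (f > 0) {
                matched++;
            }
            sum += f;
        }
        double coord = (double) matched / tf.length;
        double queryNorm = 1.0 / Math.sqrt(tf.length);
        return coord * queryNorm * sum;
    }

    public static void main(String[] args) {
        // Query Q has one term. Doc A matches it at a quarter of the tf cap.
        double x = score(new double[] {0.25});
        double y = score(new double[] {1.0});       // best possible for Q

        // Query R has two terms. Doc B matches one at the cap and misses the other.
        double v = score(new double[] {1.0, 0.0});
        double w = score(new double[] {1.0, 1.0});  // best possible for R

        System.out.println("A vs Q: " + (x / y));
        System.out.println("B vs R: " + (v / w));
    }
}
```

Both ratios come out the same, yet one document barely mentions its query term and the other is missing half of its query, so the shared percentage says little about which is the better match.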

but comparisons like that are what people are going to start to do as 
soon as you give them a number like that.  People are going to start to 
think "well doc A is an N% match for Q, and doc B is an N% match for R, 
but A is clearly a better match for Q than B is for R .. what the heck?"

...i would do a lot of "subjective" testing using a variety of queries of 
various complexities before i put any faith in producing a number like 
that.

I suspect what you'd find is that the constraints you need to impose in 
order to give you a meaningful number wind up hobbling Lucene so much it 
doesn't do a very good job of scoring anything.

