lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: comparing lucene scores across queries
Date Mon, 28 Mar 2011 08:03:52 GMT
No, scores are in general not comparable between different queries. There
are several reasons:
- Each query has a norm factor (queryNorm) that makes its scores more
comparable when it is a sub-clause of a BooleanQuery. But you are right,
this norm factor should be the same.
- Some queries, like FuzzyQuery, depend on which terms in the index match
the query.
- Inside Boolean queries, there is also a coord factor involved.
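
To make the norm and coord factors concrete, here is a minimal standalone
sketch. It does not call Lucene; it only mirrors the formulas that, as far as
I know, DefaultSimilarity uses in 3.x (queryNorm = 1/sqrt(sumOfSquaredWeights),
coord = overlap/maxOverlap). The class name ScoreFactors and all idf values
are made up for illustration.

```java
// Standalone sketch, no Lucene dependency. It mirrors the DefaultSimilarity
// formulas of Lucene 3.x; the idf values used in main() are invented.
class ScoreFactors {

    // Sum over all query terms of (idf * boost)^2.
    static double sumOfSquaredWeights(double[] idfs, double[] boosts) {
        double sum = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double weight = idfs[i] * boosts[i];
            sum += weight * weight;
        }
        return sum;
    }

    // queryNorm = 1 / sqrt(sumOfSquaredWeights): depends only on the query.
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    // coord = fraction of the query's clauses that matched the document.
    static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    public static void main(String[] args) {
        // Hypothetical query A: two rare terms. Hypothetical query B: one common term.
        double normA = queryNorm(sumOfSquaredWeights(
                new double[] {3.0, 4.0}, new double[] {1.0, 1.0}));
        double normB = queryNorm(sumOfSquaredWeights(
                new double[] {1.0}, new double[] {1.0}));

        System.out.println("queryNorm A = " + normA);
        System.out.println("queryNorm B = " + normB);
        // A document matching 1 of 2 clauses is scaled differently from
        // one matching 2 of 3, even if the matched terms score the same.
        System.out.println("coord 1/2   = " + coord(1, 2));
        System.out.println("coord 2/3   = " + coord(2, 3));
    }
}
```

Because both factors are computed per query, two raw scores coming from
different queries have been multiplied by different constants before you ever
see them.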

If you always use the same simple type of query (e.g. a plain TermQuery
that differs only in the term) on the same index, you can compare the
scores. As soon as you use complex queries (e.g. several terms combined
in a BooleanQuery, as QueryParser produces), the scores are no longer
comparable.

You can read more on all factors that are included in scoring:
http://lucene.apache.org/java/3_0_3/api/core/org/apache/lucene/search/Similarity.html
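
For orientation, the practical scoring function described on that page has,
as far as I recall it, the following shape (q is the query, d the document,
t ranges over the query terms):

```latex
\mathrm{score}(q,d) \;=\; \mathrm{coord}(q,d)\cdot \mathrm{queryNorm}(q)\cdot
\sum_{t \in q} \Big( \mathrm{tf}(t \in d)\cdot \mathrm{idf}(t)^{2}\cdot
\mathrm{boost}(t)\cdot \mathrm{norm}(t,d) \Big)
```

coord(q,d) and queryNorm(q) are the query-dependent factors mentioned above;
the sum itself also grows with the number of query terms, so longer queries
tend to produce larger raw scores.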

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Patrick Diviacco [mailto:patrick.diviacco@gmail.com]
> Sent: Monday, March 28, 2011 9:44 AM
> To: java-user@lucene.apache.org
> Subject: comparing lucene scores across queries
> 
> Hi,
> 
> sorry, I already asked a few days ago, but I got no reply and I really need
> some help on this..
> 
> I'm running several queries against a doc collection. The queries are
> documents of the collection itself, I need to measure how similar is each
> document to the rest of the collection.
> 
> Now, Lucene returns me a score per query, but I've been told such score is
> not comparable across queries. Is this correct ?
> 
> For example, aren't these scores comparable ?
> query1, score:8.324234
> query2, score:3.324238
> 
> If so, why not ? Isn't it the cosine similarity between the query vector and
> the collection's doc vectors ? I really need a comparable measure.
> 
> thanks
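
For contrast with the Lucene score, a plain cosine similarity between two
term vectors is bounded (between 0 and 1 for non-negative weights) and so is
comparable across document pairs. The following is only a sketch of that
textbook measure, assuming raw term counts as weights and whitespace
tokenization; it is not how Lucene computes its score, and the class and
method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Textbook cosine similarity over raw term counts. Not Lucene's scoring.
class CosineSketch {

    // Lower-case the text, split on whitespace, count each term.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace yields an empty first token
            }
            Integer old = counts.get(token);
            counts.put(token, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // cos(a, b) = (a . b) / (|a| * |b|); returns 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int wa = e.getValue();
            normA += (double) wa * wa;
            Integer wb = b.get(e.getKey());
            if (wb != null) {
                dot += (double) wa * wb;
            }
        }
        for (int wb : b.values()) {
            normB += (double) wb * wb;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termCounts("lucene scores across queries");
        Map<String, Integer> d2 = termCounts("comparing lucene scores");
        System.out.println("cosine(d1, d2) = " + cosine(d1, d2));
    }
}
```

Since both vectors are length-normalized, the result does not depend on how
many terms the "query" document contains, which is the property the raw
Lucene score lacks.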



