lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@apache.org>
Subject Re: A question about scoring function in Lucene
Date Wed, 15 Dec 2004 20:35:04 GMT
Chris Hostetter wrote:
> For example, using the current scoring equation, if i do a search for
> "Doug Cutting" and the results/scores i get back are...
>       1:   0.9
>       2:   0.3
>       3:   0.21
>       4:   0.21
>       5:   0.1
> ...then there are at least two meaningful pieces of data I can glean:
>    a) document #1 is significantly better then the other results
>    b) document #3 and #4 are both equaly relevant to "Doug Cutting"
> 
> If I then do a search for "Chris Hostetter" and get back the following
> results/scores...
>       9:   0.9
>       8:   0.3
>       7:   0.21
>       6:   0.21
>       5:   0.1
> 
> ...then I can assume the same corrisponding information is true about my
> new search term (#9 is significantly better, and #7/#8 are equally as good)
> 
> However, I *cannot* say either of the following:
>   x) document #9 is as relevant for "Chris Hostetter" as document #1 is
>      relevant to "Doug Cutting"
>   y) document #5 is equally relevant to both "Chris Hostetter" and
>      "Doug Cutting"

That's right.  Thanks for the nice description of the issue.

> I think the OP is arguing that if the scoring algorithm was modified in
> the way they suggested, then you would be able to make statements x & y.

And I am not convinced that, with the changes Chuck describes, one can 
be any more confident of x and y.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message