lucene-dev mailing list archives

From "Terry Steichen" <te...@net-frame.com>
Subject Sorting Messes up Scores
Date Mon, 22 Mar 2004 17:04:45 GMT
When you use the new sorting features, the relevance scores are no longer
normalized to a maximum of 1.0.  (A recent test showed most scores now range
up to 3.0 or so.)  As Tim suggests below, I'd like to know if fixing this is
important to others.  (It definitely is to me.)  If so, I'll submit it as a bug.

Regards,

Terry

----- Original Message -----
From: <tjones@apache.org>
To: "Terry Steichen" <terry@net-frame.com>
Sent: Monday, March 22, 2004 11:38 AM
Subject: Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/search FieldSortedHitQueue.java


> Terry,
>
> Yes - that's correct - it's quite possible the scores will have values
> greater than 1.0 when sorted.  It's something that was just kind of
> ignored, figuring that when the results are sorted by something other
> than score, having normalized scores probably isn't so important.
>
> If it's a concern, please feel free to raise it on the dev list.
>
> Tim
>
>
> > I've looked more closely at the Sorting code and have a concern but I'm
> > not smart enough to tell whether it's real or not.
> >
> > When the Hits class collects returned hits, it then normalizes the score.
> > However, in doing this, it assumes that the returned hits (in the form of
> > a TopDocs class) are ordered by score.  So it takes first item (index of 0
> > in the array) in the returned hits and uses this as the normalization
> > factor.
> >
> > When you introduce the sorting, what the Hits class gets back is not
> > TopDocs, but TopFieldDocs, which has already been sorted in some order
> > other than score.  Hence, the built-in assumption of Hits (that the first
> > document in the array is the highest score and appropriate to use for
> > normalization) no longer holds.  Consequently the normalization will be
> > anything but normalized.
> >
> > Again, I emphasize my technical limitations, but does this make sense to
> > you?
> >
> > Regards,
> >
> > Terry
> >
> > PS: BTW, it appears that, if I compile your code under 1.4, it runs just
> > fine under 1.3.1 (providing the regex lib references are removed, as per
> > your patch).
>
>
>
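
The normalization problem described in the quoted message can be illustrated
with a small standalone sketch. This is not Lucene's actual Hits code: the
class and method names below (ScoreNormalizationSketch, normalizeByFirst,
normalizeByMax) are hypothetical, and the scores are made-up values chosen
only to show the effect. It assumes, as the message describes, that
normalization divides every score by the score of the first hit in the array.

```java
import java.util.Arrays;

// Hypothetical illustration only; it does not use or reproduce the Lucene API.
public class ScoreNormalizationSketch {

    // Models the behavior described in the thread: divide every score by the
    // score of the first hit, on the assumption that it is the highest one.
    static float[] normalizeByFirst(float[] scores) {
        float[] out = Arrays.copyOf(scores, scores.length);
        if (scores.length == 0 || scores[0] <= 0.0f) {
            return out; // nothing sensible to normalize by
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = scores[i] / scores[0];
        }
        return out;
    }

    // One possible alternative: find the real maximum first, so the result
    // stays within 0..1 regardless of the order the hits arrive in.
    static float[] normalizeByMax(float[] scores) {
        float[] out = Arrays.copyOf(scores, scores.length);
        float max = 0.0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        if (max <= 0.0f) {
            return out;
        }
        for (int i = 0; i < out.length; i++) {
            out[i] = scores[i] / max;
        }
        return out;
    }

    public static void main(String[] args) {
        // Raw scores in field-sorted order (e.g. by date), NOT by score.
        float[] sortedByField = {0.5f, 1.5f, 1.0f};
        System.out.println(Arrays.toString(normalizeByFirst(sortedByField)));
        System.out.println(Arrays.toString(normalizeByMax(sortedByField)));
    }
}
```

With these made-up inputs, dividing by the first hit's score pushes the second
hit to 3.0, while dividing by the true maximum keeps every score at or below
1.0. Whether a fix along these lines belongs in Hits or elsewhere is left open
in the thread.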



