lucene-dev mailing list archives

From Jamie M <jami...@yahoo.com>
Subject Re: Sorting Messes up Scores
Date Mon, 22 Mar 2004 21:29:54 GMT
Yes, scores are important to me too, even when the
results aren't sorted by score.

jamie

--- Terry Steichen <terry@net-frame.com> wrote:
> When you use the new sorting features, the relevance scores get messed up.
> (A recent test showed most scores now range up to 3.0 or so.)  As Tim
> suggests below, I'd like to know if fixing this is important to others.  (It
> definitely is to me.)  If so, I'll submit it as a bug.
> 
> Regards,
> 
> Terry
> 
> ----- Original Message -----
> From: <tjones@apache.org>
> To: "Terry Steichen" <terry@net-frame.com>
> Sent: Monday, March 22, 2004 11:38 AM
> Subject: Re: cvs commit:
> jakarta-lucene/src/java/org/apache/lucene/search
> FieldSortedHitQueue.java
> 
> 
> > Terry,
> >
> > Yes - that's correct - it's quite possible the scores will have values
> > greater than 1.0 when sorted.  It's something that was just kind of
> > ignored, figuring that when the results are sorted by something other
> > than score, having normalized scores probably isn't so important.
> >
> > If it's a concern, please feel free to raise it on the dev list.
> >
> > Tim
> >
> >
> > > > I've looked more closely at the Sorting code and have a concern but I'm not
> > > > smart enough to tell whether it's real or not.
> > > >
> > > > When the Hits class collects returned hits, it then normalizes the score.
> > > > However, in doing this, it assumes that the returned hits (in the form of a
> > > > TopDocs class) are ordered by score.  So it takes first item (index of 0 in
> > > > the array) in the returned hits and uses this as the normalization factor.
> > > >
> > > > When you introduce the sorting, what the Hits class gets back is not
> > > > TopDocs, but TopFieldDocs, which has already been sorted in some order other
> > > > than score.  Hence, the built-in assumption of Hits (that the first document
> > > > in the array is the highest score and appropriate to use for normalization)
> > > > no longer holds.  Consequently the normalization will be anything but
> > > > normalized.
> > > >
> > > > Again, I emphasize my technical limitations, but does this make sense to
> > > > you?
> > > >
> > > > Regards,
> > > >
> > > > Terry
> > > >
> > > > PS: BTW, it appears that, if I compile your code under 1.4, it runs just
> > > > fine under 1.3.1 (providing the regex lib references are removed, as per
> > > > your patch).
> >
> >
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
> 
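
To make the concern in the quoted thread concrete, here is a minimal sketch in plain Java. It does not use or reproduce the actual Lucene classes: `ScoreNormalizationSketch`, `normalizeByFirst`, and `normalizeByMax` are hypothetical names chosen for illustration, and the raw scores are made up. It only assumes what the quoted message describes, namely that normalization divides by the score of the hit at index 0, and contrasts that with one possible alternative (dividing by the maximum score found anywhere in the array).

```java
import java.util.Arrays;

// Illustrative only: hypothetical helper class, not part of Lucene.
public class ScoreNormalizationSketch {

    // Divides every score by the score at index 0, mirroring the behaviour
    // described in the thread. Correct only if the array is ordered by score.
    static float[] normalizeByFirst(float[] rawScores) {
        float[] out = new float[rawScores.length];
        if (rawScores.length == 0 || rawScores[0] <= 0f) {
            return rawScores.clone();
        }
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] / rawScores[0];
        }
        return out;
    }

    // One possible alternative: divide by the largest score in the array,
    // which does not depend on the order of the hits.
    static float[] normalizeByMax(float[] rawScores) {
        float max = 0f;
        for (float s : rawScores) {
            max = Math.max(max, s);
        }
        if (max <= 0f) {
            return rawScores.clone();
        }
        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] / max;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up raw scores for three hits, ordered by score.
        float[] byScore = {2.0f, 1.0f, 0.5f};
        // The same three hits, ordered by some other field.
        float[] byField = {0.5f, 2.0f, 1.0f};

        System.out.println(Arrays.toString(normalizeByFirst(byScore)));
        System.out.println(Arrays.toString(normalizeByFirst(byField)));
        System.out.println(Arrays.toString(normalizeByMax(byField)));
    }
}
```

With the hits ordered by another field, dividing by the first score yields values above 1.0, which matches the symptom reported in the thread. Dividing by the maximum keeps every value at or below 1.0 regardless of order, at the cost of one extra pass over the scores.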

