lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Fraschetti <frasche...@gmail.com>
Subject Re: sorting and score ordering
Date Fri, 15 Oct 2004 05:57:36 GMT
I've added the raw lucene source to my IDE and have attempted to debug
the sorting portion of lucene... which has given me a better insight,
but I still do not quite understand how lucene's sorting works. From
what I can tell, a search sorted by the score and a field of my
choosing.. uses the FieldSortedHitQueue to decide what order they
should be in.. am i correct? it uses a priority queue and overrides
the lessThan function to do some ordering, but I'm not sure if thats
exactly where this happens for my sorting or what else exactly is
going on....  I'm sure there is an aspect related to the collators..
but I cannot seem to understand it. At the time of searching and all
through the process, the Sort object and the elements created from it
does have both the  <score> and "rank_field"  items.. so obviously
they are there and active, but I'm not too sure why the rank isn't
looked at...

Thanks in advance!

-Chris


On Thu, 14 Oct 2004 13:53:22 -0700, Chris Fraschetti
<fraschetti@gmail.com> wrote:
> currently rank is indexed as Field.Keyword("rank_field", somerank);
> 
> all ranks will be 0-100 .... so i format them as  0###   to make
> lucene sort them correctly. For my testing purposes, I confine the
> range to 0-10.. so all ranks are indexed and retreived as 000-010
> respectively.
> 
> if i create a sort with the sortfield array of score,rank gives me the
> identical results to only sorting on score. BUT a sort of rank only
> does sort by rank.
> 
> -Chris
> 
> On Thu, 14 Oct 2004 16:49:12 -0400, Erik Hatcher
> <erik@ehatchersolutions.com> wrote:
> 
> 
> > On Oct 14, 2004, at 4:40 PM, Chris Fraschetti wrote:
> > > If I print the different hit score for each doc, they are the same..
> > > but the secondary sorting still does not take affect. is it possible
> > > that even though the float returned to me and i print out, internally
> > > has even more precision and they are indeed not the same? I'm getting
> > > ready to edit the lucene source and recompile.. but I just wanted to
> > > make sure that this is indeed my problem.
> >
> > How did you index your rank field?  That has bearing on how sorting
> > works.  What are the values of the rank field like?  Numeric?
> >
> >        Erik
> >
> >
> >
> > >
> > > -Chris
> > >
> > >
> > > On Thu, 14 Oct 2004 12:42:37 -0400, Richard Alpert
> > > <richard.alpert@cnet.com> wrote:
> > >> Oh, yeah, of course.  I'm sorry, I wasn't reading thoroughly enough.
> > >>
> > >> Richard
> > >>
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Chris Fraschetti [mailto:fraschetti@gmail.com]
> > >> Sent: Thursday, October 14, 2004 12:42 PM
> > >> To: Richard Alpert
> > >> Subject: Re: sorting and score ordering
> > >>
> > >> Richard, thank you, but i would need the score to be changed inside
> > >> lucene in order for the sort to work properly... my earlier questions
> > >> was in regards to rounding the score internally in order to allow for
> > >> a secondary sort to take hold when similar results... say to a precion
> > >> of 0.## occur.
> > >>
> > >> Thanks,
> > >> Chris
> > >>
> > >> On Thu, 14 Oct 2004 12:36:15 -0400, Richard Alpert
> > >> <richard.alpert@cnet.com> wrote:
> > >>> Chris, what about something like
> > >>>
> > >>>   score1 = ((float)(int(score*100+0.5)))/100.00
> > >>>
> > >>> 'score1' will equal 'score' rounded to the nearest 0.01.
> > >>>
> > >>> Richard
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: Chris Fraschetti [mailto:fraschetti@gmail.com]
> > >>> Sent: Thursday, October 14, 2004 12:14 PM
> > >>> To: Lucene Users List
> > >>> Subject: Re: sorting and score ordering
> > >>>
> > >>> Is there a way to generalize the score that lucene is comparing ?
> > >>> I.e.
> > >>> round it off to say a precision of 0.## ? In order to make this sort
> > >>> work better with 'duplicate' values? (by my terms... hits that have
> > >>> scores within within a certain distance of eachother.)
> > >>>
> > >>> -Chris
> > >>>
> > >>> On Thu, 14 Oct 2004 11:27:34 -0400, Erik Hatcher
> > >>> <erik@ehatchersolutions.com> wrote:
> > >>>> On Oct 13, 2004, at 5:40 PM, Chris Fraschetti wrote:
> > >>>>> and finally if i do....
> > >>>>>
> > >>>>> SortField score_sort = ScoreField.FIELD_SCORE;
> > >>>>> SortField rank_sort = new SortField(RANK_FIELD, true);
> > >>>>> SortField[] sort_fields = {score_sort, rank_sort};
> > >>>>> Sort sort = new Sort(sort_fields);
> > >>>>> hits = searcher.search(query, sort);
> > >>>>>
> > >>>>> I get the same results as I did with the score_sort only...
no
> > >>>>> change
> > >>>>> in the ordering of the rank is there... any ideas? It looks
to me
> > >>>>> as
> > >>>>> if it's completely ignoring it.
> > >>>>
> > >>>> This is sorting first by score and then by your rank field.  The
> > >>>> rank
> > >>>> field sort only applies when the scores are the same.  I suspect
> > >>>> you're
> > >>>> getting different scores so you'd never see rank come into play.
> > >>>>
> > >>>> Display the score and rank in your results to see for sure.
> > >>>>
> > >>>>        Erik
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --------------------------------------------------------------------
> > >>>> -
> > >>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > >>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > >>>>
> > >>>>
> > >>>
> > >>> --
> > >>> ___________________________________________________
> > >>> Chris Fraschetti, Student CompSci System Admin
> > >>> University of San Francisco
> > >>> e fraschetti@gmail.com | http://meteora.cs.usfca.edu
> > >>>
> > >>>
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > >>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > >>>
> > >>>
> > >>
> > >> --
> > >> ___________________________________________________
> > >> Chris Fraschetti, Student CompSci System Admin
> > >> University of San Francisco
> > >> e fraschetti@gmail.com | http://meteora.cs.usfca.edu
> > >>
> > >
> > >
> > > --
> > > ___________________________________________________
> > > Chris Fraschetti, Student CompSci System Admin
> > > University of San Francisco
> > > e fraschetti@gmail.com | http://meteora.cs.usfca.edu
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> >
> > ---------------------------------------------------------------------
> >
> >
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> >
> 
> 
> 
> 
> --
> ___________________________________________________
> Chris Fraschetti, Student CompSci System Admin
> University of San Francisco
> e fraschetti@gmail.com | http://meteora.cs.usfca.edu
> 


-- 
___________________________________________________
Chris Fraschetti, Student CompSci System Admin
University of San Francisco
e fraschetti@gmail.com | http://meteora.cs.usfca.edu

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message