lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Lucene Scoring Behavior
Date Wed, 17 Sep 2003 19:46:48 GMT
Try using IndexSearcher.explain and dump out the contents of what it 
returns either as toString or toHtml (whichever format suits your 
environment best) and see what it has to say.  It'll give you the 
low-down on the factors involved in the score calculation.  I'm 
interested to see what you come up with.

	Erik


On Wednesday, September 17, 2003, at 03:33  PM, Terry Steichen wrote:

> I've run across some puzzling behavior regarding scoring.  I have a 
> set of documents which contain, among others, a date field (whose 
> contents is a string in the YYYYMMDD format).  When I query on the 
> date 20030917 (that is, today), I get 157 hits, all of which have a 
> score of .23000652.  If I use 20030916 (yesterday), I get 197 hits, 
> each of which has a score of .22295427.
>
> So far, all seems logical.  However, when I search for all records for 
> the date 20030915, the first two (of 174 hits) have a score of 1.0, 
> while all the rest of the hits have a score of .03125.  Here is a 
> tabulation of these and a few more queries:
>
> Query Date      Result
> =======        ========================
> 20030917        all have a score of .23000652 (157)
> 20030916        all have a score of .22295427 (197)
> 20030915        first 2 have a 1.0 score, all rest are .03125 (174)
> 20030914        all have a score of .21384604 (264)
> 20030913        first 2 have a 1.0 score, all rest are .03125 (156)
> 20030912        all have a score .2166833 (241)
> 20030911        first 3 have a 1.0 score, all rest are .03125 (244)
> 20030910        all have a score of  .2208193 (211)
>
> I would expect that all the hits would have the same score, and I 
> would expect it to be normalized to 1 (unless, I guess, the top score 
> was less than 1, in which case normalization presumably doesn't > occur).
>
> Does anyone have any ideas as to what might be going on here?  (I'm 
> using the latest CVS sources, obtained this afternoon.)
>
> Regards,
>
> Terry


Mime
View raw message