lucene-java-user mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Scoring issue
Date Thu, 27 Nov 2008 06:39:12 GMT
Alex,

If you have length normalization turned on, then the length of the
tagKey field (the number of tokens, and perhaps even the distance
between the tokens) is much greater in the second document than in
the first. The length is the total number of tokens in the field,
i.e. if you add more than one field with the same name to a document,
they are concatenated. With your lengthNorm of 1.0 / tokenCount, the
longer field gets a much smaller norm, which is why the first hit
scores higher.
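
To make the arithmetic concrete, here is a rough sketch in plain Java
(no Lucene calls) that applies the 1.0 / tokenCount rule from your
custom similarity to the tagKey values of both documents. The class
name, the helper methods and the whitespace tokenization are my own
assumptions for illustration; your analyzer may split tokens
differently, and Lucene also stores norms with reduced precision, so
treat the numbers as approximate.

```java
import java.util.Arrays;
import java.util.List;

public class LengthNormSketch {

    // tagKey values of document 5117, copied from the index dump.
    static final List<String> DOC_5117 = Arrays.asList(
        "wholesale hot dog stand equipment");

    // tagKey values of document 11274, copied from the index dump.
    static final List<String> DOC_11274 = Arrays.asList(
        "hot dog meal", "hot dog restaurant", "hotdog", "hot dog",
        "hot dog dining", "best hotdog", "cuisine hot dog", "hotdog stand",
        "hotdog restaurant", "hot dog grill", "hot dog cuisine",
        "hot dog stand", "hot dog menu", "hot dog shop", "hotdog vendor",
        "hotdog grill");

    // Hypothetical helper: counts whitespace-separated tokens across all
    // values of a same-named field, as if the values were concatenated.
    // Assumption: a real analyzer may tokenize differently.
    static int countTokens(List<String> fieldValues) {
        int total = 0;
        for (String value : fieldValues) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                total += trimmed.split("\\s+").length;
            }
        }
        return total;
    }

    // The length normalization described in the question: 1.0 / tokenCount.
    static float lengthNorm(int tokenCount) {
        return 1.0f / tokenCount;
    }

    public static void main(String[] args) {
        int shortLength = countTokens(DOC_5117);
        int longLength = countTokens(DOC_11274);
        System.out.println("doc 5117:  tokens=" + shortLength
            + " lengthNorm=" + lengthNorm(shortLength));
        System.out.println("doc 11274: tokens=" + longLength
            + " lengthNorm=" + lengthNorm(longLength));
    }
}
```

Under these assumptions the first document has 5 tokens in tagKey and
the second has 40, so the norm of the second is one eighth of the
first. The second document does contain "hot" and "dog" more often,
but that apparently does not make up for such a large difference in
the norm.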

Try the Searcher#explain method for more details:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int)


    karl

26 nov 2008 kl. 20.22 skrev AlexElba:

>
> Hello,
> I have two documents in my Lucene index:
>
> Document<stored/uncompressed,indexed<tagId:5117>
> stored/uncompressed<tagName:Wholesale Hot Dog Stand Equipment>
> stored/uncompressed,indexed,tokenized<tagKey:wholesale hot dog stand
> equipment> stored/uncompressed>
>
> Document<stored/uncompressed,indexed<tagId:11274>
> stored/uncompressed<tagName:Hot Dogs>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog meal>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog restaurant>
> stored/uncompressed,indexed,tokenized<tagKey:hotdog>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog dining>
> stored/uncompressed,indexed,tokenized<tagKey:best hotdog>
> stored/uncompressed,indexed,tokenized<tagKey:cuisine hot dog>
> stored/uncompressed,indexed,tokenized<tagKey:hotdog stand>
> stored/uncompressed,indexed,tokenized<tagKey:hotdog restaurant>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog grill>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog cuisine>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog stand>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog menu>
> stored/uncompressed,indexed,tokenized<tagKey:hot dog shop>
> stored/uncompressed,indexed,tokenized<tagKey:hotdog vendor>
> stored/uncompressed,indexed,tokenized<tagKey:hotdog grill>>
>
> and I am searching for +tagKey:hot +tagKey:dog
>
> which is an exact match for the 2nd document, but I am getting a
> score of 1.0 for the first document and 0.7 for the second one.
>
> I have a custom similarity where lengthNorm is (1.0 / tokenCount);
> the others are some constants.
>
> Why is my first document getting a higher score?
> -- 
> View this message in context: http://www.nabble.com/Scoring-issue-tp20707410p20707410.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

