lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Trieschnigg, R.B. \(Dolf\)" <r.b.trieschn...@ewi.utwente.nl>
Subject Different scoring mechanism
Date Wed, 07 Jun 2006 13:45:28 GMT
Hi,

I am trying to implement an alternative scoring mechanism in Lucene.
A query of multiple terms is represented as a BooleanQuery with one or more Occur.SHOULD clauses.

The scoring for a document is very simple:
- Assign a score for each queryterm. 
   ! If a document does not contain a queryterm this score can be larger or smaller than 0
!
- Sum these scores.

I run into problems in case a document does not contain a query term. See the following sample
results with query "a" or "b".

Document1: x b p j x s n f e p t h q y t b q d x 
Calculated score: -0.9631268
Score should be:
-2.7171288 = Sum of:
  -1.754002 = score for "a", term frequency 0
  -0.9631268 = score for "b", term frequency 2

Document2: r b g a p i h i u x w p f m p m j c m d 
Calculated score: -2.6189766
Score should be:
-2.6189766 = Sum of:
  -1.3109605 = score for "a", term frequency 1; 
  -1.3080161 = score for "b", term frequency 1;

The calculated score for document2 is as I have in mind. However, the score for documment1
is incorrect. The BooleanScorer seems to ignore the score for term "a"; as it doesn't exist
in the document.

Any ideas how to solve this?

Regards,
Dolf






---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message