lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Philippe Deslauriers <>
Subject RE: Occurence (freq) and ordering
Date Tue, 02 May 2006 14:48:21 GMT
Thanks for the Field.setOmitNorms(true) tip!

Regarding the Similarity implementation I am trying to do, somehow it does
not work.

Here's what I understand:

Scorer implementation uses the method defined in Similarity, to compute
score. (the formula expressed in
html" is implemented in the scorer.  

According to the formula if all my methods return 1, except for tf(freq)
which simply returns freq, all should work.

score(q,d) = SUM (t in q) : tf(freq) * 1 * 1 * 1 * 1 * 1 * 1

By example doc contains 3 times the word "test", and 1 time the word
"example", and the query was looking for both words, the score for the doc
should be 4.

But whatever I do, score is 1.

What I am missing?



-----Original Message-----
From: Chris Hostetter [] 
Sent: Thursday, April 27, 2006 2:21 PM
Subject: Re: Occurence (freq) and ordering

: Upgrading from lucene 1.3 to 1.9.

: We need to order the result in order of occurrences (score of a doc = sum
: occurrences of all Query).

: I am just starting to read on Similarity, weights etc.

You are definitely on the right track with Similarity.  What you want is a
Similarity implimentation where the values returned by most methods are
either 0 or 1, except for the tf(int) and tf(float) which should be an
identify function.

If you *allways* want *every* query to work this way, then you may also
want to look at using the new Field.setOmitNorms(true) option when you
index your documents.  It not only removes the lengthNorm from the scoring
equation, but it can help to reduce the size of your index (which i seem
to recall you were concerned with an another thread)


To unsubscribe, e-mail:
For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message