lucene-dev mailing list archives

From "Phil brunet" <>
Subject DocumentWriter.writeNorms : the way to compute the normalisation factor
Date Thu, 15 Apr 2004 12:08:11 GMT
Hi all,

In the DocumentWriter.writeNorms(Document doc, String segment) method
(Lucene 1.3), I wonder whether there is a particular reason to compute the
normalisation factor from the number of tokens contained in the document
(using the fieldLengths array) instead of from the number of positions
(using the fieldPositions array)?

I think that in most cases the difference is not significant, so using
fieldLengths or using fieldPositions is equivalent. But I would like to be
sure of it.

So, if anybody has an opinion ...



Nota bene:

If I understood correctly, the fieldLengths value and the fieldPositions
value differ for a given document if and only if the document contains at
least one token with a position increment of 0.

In my case, such a token should not be counted in the normalisation factor,
because I need this factor to be exactly in inverse proportion to the number
of DISTINCT tokens (i.e. ignoring those with an increment of 0).
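
To make the difference concrete, here is a minimal standalone sketch. It does
not use the Lucene classes: NormSketch, countTokens, countPositions and
lengthNorm are hypothetical names, and the 1/sqrt(n) formula is only my
assumption of what the default Similarity does with the count it is given.
Each array entry stands for the position increment of one token.

```java
// Standalone sketch, not Lucene code. All names here are hypothetical.
final class NormSketch {

    // What a per-token counter (like fieldLengths) would hold:
    // every token counts, whatever its position increment.
    static int countTokens(int[] increments) {
        return increments.length;
    }

    // What a position counter (like fieldPositions) would hold:
    // the sum of the increments, so a token with increment 0 adds nothing.
    static int countPositions(int[] increments) {
        int positions = 0;
        for (int inc : increments) {
            positions += inc;
        }
        return positions;
    }

    // Assumed length normalisation: 1 / sqrt(n).
    static float lengthNorm(int n) {
        return (float) (1.0 / Math.sqrt(n));
    }

    public static void main(String[] args) {
        // Four tokens; the third one is stacked on the second (increment 0),
        // for example a synonym injected by a custom filter.
        int[] increments = {1, 1, 0, 1};

        int tokens = countTokens(increments);
        int positions = countPositions(increments);

        System.out.println("tokens=" + tokens + " norm=" + lengthNorm(tokens));
        System.out.println("positions=" + positions + " norm=" + lengthNorm(positions));
    }
}
```

With these increments the token count is 4 and the position count is 3, so a
norm based on positions comes out larger. Note that in this sketch an
increment greater than 1 would also make the two counts differ.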
