lucene-dev mailing list archives

From "Phil brunet" <philouc...@hotmail.com>
Subject DocumentWriter.writeNorms : the way to compute the normalisation factor
Date Thu, 15 Apr 2004 12:08:11 GMT
Hi all,

In the DocumentWriter.writeNorms(Document doc, String segment) method
(Lucene 1.3), I wonder whether there is a particular reason to compute the
normalisation factor from the number of tokens contained in the document
(using the fieldLengths array) rather than from the number of positions
(using the fieldPositions array).

I think that in most cases the difference is not significant, so using
fieldLengths or fieldPositions is equivalent. But I would like to be
sure of it.
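
To put a rough number on "not significant", here is a minimal sketch. It
assumes the length normalisation is 1/sqrt(n), which is what
DefaultSimilarity.lengthNorm returns; the class and method names below are
invented for illustration and are not part of Lucene.

```java
public class NormCompare {
    // Assumed formula: 1/sqrt(n), as in DefaultSimilarity.lengthNorm.
    static float lengthNorm(int n) {
        return (float) (1.0 / Math.sqrt(n));
    }

    // Relative gap between the norm computed from the token count
    // and the norm computed from the position count.
    static float relativeGap(int tokens, int positions) {
        float a = lengthNorm(tokens);
        float b = lengthNorm(positions);
        return Math.abs(a - b) / Math.max(a, b);
    }

    public static void main(String[] args) {
        // Hypothetical long field: 100 tokens, 5 of them at increment 0.
        System.out.println(relativeGap(100, 95));
        // Hypothetical short field: 4 tokens, 2 of them at increment 0.
        System.out.println(relativeGap(4, 2));
    }
}
```

Under that assumption the gap is about 2.5% for the long field but about 29%
for the short one, so the two counts are only close to equivalent when few
tokens share a position relative to the field length.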

So, if anybody has an opinion ...

Thanks

Phil

Nota bene:
=======

If I understood correctly, the fieldLengths value and the fieldPositions value
differ for a given document if and only if the document contains at
least one token with a position increment of 0.

In my case, such a token should not be counted in the normalisation factor,
because I need this factor to be exactly inversely proportional to the number
OF DIFFERENT tokens (i.e. ignoring those with increment set to 0).
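
The difference between the two counts can be sketched as follows. This is
only an illustration: it assumes that each token adds 1 to the length and
advances the position by its increment. TokenVsPosition and its methods are
hypothetical names, not Lucene classes.

```java
public class TokenVsPosition {
    // Token count: every token counts once, whatever its increment.
    static int tokenCount(int[] increments) {
        return increments.length;
    }

    // Position count: assumed to be the sum of the position increments,
    // so a token with increment 0 does not add a new position.
    static int positionCount(int[] increments) {
        int positions = 0;
        for (int i = 0; i < increments.length; i++) {
            positions += increments[i];
        }
        return positions;
    }

    public static void main(String[] args) {
        // Third token is stacked on the second one (increment 0).
        int[] stacked = {1, 1, 0, 1};
        System.out.println(tokenCount(stacked) + " tokens, "
                + positionCount(stacked) + " positions");
    }
}
```

Under the same assumption, an increment greater than 1 (a gap left by a
filter, for instance) would also make the two counts differ, this time with
more positions than tokens.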
