lucene-dev mailing list archives

From "ABDOU Samir" <samir.ab...@unine.ch>
Subject RE : Term Collection Frequency?
Date Wed, 04 Aug 2004 18:34:44 GMT
Thanks,

>> What about the frequency of any given term in the whole collection!?

>IndexReader.docFreq(Term t)

This method does not return the collection frequency of the given term
t; it returns the number of documents in which the term appears.

Here is an example of what I want:

-------------------------------
We have this table for a term T

Doc ID : 0, 1, 2, 3, 4
Frequency : 3, 5, 4, 2, 5  

That is, the term appears 3 times in document 0, 5 times in
document 1, and so on.

So the collection frequency of this term would be 3+5+4+2+5 = 19

N.B.: calculating this for each term at runtime would be very expensive.
Is it possible to compute and store this information during indexing?
-------------------------------
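
To make the distinction concrete, here is a minimal sketch in plain Java of
keeping both counts up to date while documents are indexed, so that no
per-term scan is needed at runtime. It does not use Lucene at all: the class
TermStats and its methods addDocument, docFreq, collectionFreq and
exampleFromTable are hypothetical names chosen for illustration, and the
input is assumed to be a per-document map from term to frequency.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: accumulates per-term totals
// as documents are indexed, so lookups later are simple map reads.
class TermStats {
    // Number of documents that contain each term.
    private final Map<String, Integer> docFreq = new HashMap<>();
    // Total number of occurrences of each term across all documents.
    private final Map<String, Long> collectionFreq = new HashMap<>();

    // Call once per indexed document with that document's term -> frequency map.
    void addDocument(Map<String, Integer> termFreqs) {
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            if (e.getValue() <= 0) {
                continue; // a term that does not occur contributes nothing
            }
            docFreq.merge(e.getKey(), 1, Integer::sum);
            collectionFreq.merge(e.getKey(), e.getValue().longValue(), Long::sum);
        }
    }

    int docFreq(String term) {
        return docFreq.getOrDefault(term, 0);
    }

    long collectionFreq(String term) {
        return collectionFreq.getOrDefault(term, 0L);
    }

    // Rebuilds the table from the message: term "T" in documents 0..4.
    static TermStats exampleFromTable() {
        int[] freqs = {3, 5, 4, 2, 5};
        TermStats stats = new TermStats();
        for (int f : freqs) {
            Map<String, Integer> doc = new HashMap<>();
            doc.put("T", f);
            stats.addDocument(doc);
        }
        return stats;
    }

    public static void main(String[] args) {
        TermStats stats = exampleFromTable();
        System.out.println("docFreq(T) = " + stats.docFreq("T"));               // 5
        System.out.println("collectionFreq(T) = " + stats.collectionFreq("T")); // 19
    }
}
```

For the table above, the document frequency is 5 (the term occurs in five
documents) while the collection frequency is 19. The trade-off in this sketch
is one extra counter per term, updated at indexing time; handling deleted
documents is not covered here.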


Regards,
Samir
