lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: Term collection frequency
Date Mon, 22 Jun 2009 09:09:07 GMT
There's IndexReader.docFreq(Term), which returns the number of
documents that the term occurs in (deletions are not reflected in
this count until they have been merged away).

But the global count of how many times a Term occurred across all docs
is not stored.

You'd have to get a TermDocs enum for that Term, iterate through all
docs, and sum up the freq() from each doc, to compute that, I believe.
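
A rough sketch of that loop, in case it helps. To keep it runnable without
Lucene on the classpath, it uses a hypothetical stand-in interface (Postings)
that mirrors only the next() and freq() methods of TermDocs; the class and
helper names here are made up for illustration. Against a real index you
would presumably obtain the enum from IndexReader.termDocs(Term) and close()
it when done.

```java
import java.io.IOException;

public class CollectionFrequency {

    /** Hypothetical stand-in for the two TermDocs methods the loop needs. */
    public interface Postings {
        boolean next() throws IOException; // advance to the next document containing the term
        int freq();                        // occurrences of the term in the current document
    }

    /** Sums the per-document frequency over every document in the enum. */
    public static long sumFreq(Postings postings) throws IOException {
        long total = 0;
        while (postings.next()) {
            total += postings.freq();
        }
        return total;
    }

    /** Illustration only: in-memory postings with one freq value per document. */
    public static Postings fromArray(final int[] freqs) {
        return new Postings() {
            private int i = -1;
            public boolean next() { return ++i < freqs.length; }
            public int freq() { return freqs[i]; }
        };
    }

    public static void main(String[] args) throws IOException {
        // Three documents in which the term occurs 3, 1 and 2 times.
        System.out.println(sumFreq(fromArray(new int[] {3, 1, 2})));
    }
}
```

Note that this walks every posting for the term, so the cost grows with the
term's document frequency; nothing is precomputed.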


On Mon, Jun 22, 2009 at 4:55 AM, Murat Yakici<> wrote:
> Hi,
> As far as I know, there is no public API to get a term's collection
> frequency in Lucene, apart from writing routines with TFV or TermEnum.
> Does Lucene store the number of times a term occurs in the index? If so,
> can someone direct me to the low-level API where I can get this
> information through some extension? If that is not possible, I imagine
> this would require a change in the index format. Which classes should I
> be dealing with, and what should I be careful about when implementing
> such a change?
> Cheers,
> Murat Yakici
> Department of Computer & Information Sciences
> University of Strathclyde
> Glasgow, UK
> -------------------------------------------
> The University of Strathclyde is a charitable body, registered in Scotland,
> with registration number SC015263.
