lucene-java-user mailing list archives

From Bianca Pereira
Subject Calculate Term Frequency
Date Tue, 19 Aug 2014 14:04:11 GMT
Hi everybody,

  I would like your suggestions on how to calculate the term frequency of a
term in a Lucene document. Currently I call MultiFields.getTermDocsEnum,
iterate through the returned DocsEnum 'de' until I reach the desired
document, and read the frequency with de.freq().

  My solution gives me the result I want, but it is too slow. For
instance, I want to calculate the frequency of a given term in N
documents, one after another. For every new document I have to
retrieve exactly the same DocsEnum again and iterate until I find the
document I want. Of course I cannot cache the DocsEnum (yes, I made this huge
mistake) because it is a forward-only iterator.
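
  To make the problem concrete, here is a minimal sketch in plain Java (it
is not Lucene code). It shows the difference between caching the iterator
itself, which does not work, and caching what the iterator yields. Posting
and TermFreqCache are hypothetical names made up for this illustration; in
Lucene the role of the iterator would be played by the DocsEnum, read with
nextDoc() and freq(). The sketch assumes the postings for one term fit in
memory.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Plain-Java sketch: Posting and TermFreqCache are hypothetical, not Lucene classes.
final class TermFreqCache {

    /** One (document id, term frequency) pair, standing in for one DocsEnum position. */
    static final class Posting {
        final int docId;
        final int freq;

        Posting(int docId, int freq) {
            this.docId = docId;
            this.freq = freq;
        }
    }

    /** Drains a forward-only iterator exactly once and copies its contents into a map. */
    static Map<Integer, Integer> materialize(Iterator<Posting> postings) {
        Map<Integer, Integer> freqByDoc = new HashMap<>();
        while (postings.hasNext()) {
            Posting p = postings.next();
            freqByDoc.put(p.docId, p.freq);
        }
        return freqByDoc;
    }

    /** Looks up a cached frequency; a document that lacks the term has frequency 0. */
    static int termFreq(Map<Integer, Integer> freqByDoc, int docId) {
        return freqByDoc.getOrDefault(docId, 0);
    }

    public static void main(String[] args) {
        List<Posting> postings = Arrays.asList(
                new Posting(2, 3), new Posting(5, 1), new Posting(9, 4));

        // One pass over the iterator, then any number of lookups in any order.
        Map<Integer, Integer> cache = materialize(postings.iterator());
        for (int docId : new int[] {9, 2, 7}) {
            System.out.println("doc " + docId + " -> " + termFreq(cache, docId));
        }
    }
}
```

  The map costs memory proportional to the number of documents that
contain the term, so this trade-off only makes sense when the same term is
queried for many documents.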

  Do you have any suggestions on how I can get the term frequency quickly?
The only suggestion I have had so far was "Do it programmatically, don't use
Lucene". Is that the solution?

  Thank you.

  Bianca Pereira
