lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Calculate Term Frequency
Date Tue, 19 Aug 2014 15:25:49 GMT
Hmmm, I'm not at all an expert here, but Solr has a function
query, "termfreq", that I think does what you're doing. I wonder
if the code for that function query would be a good place to
copy from (or even use directly)? See TermFreqValueSource...

Maybe not helpful at all, but...
Erick

On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira <aivykarter@gmail.com> wrote:
> Hi everybody,
>
>   I would like your suggestions on how to calculate term frequency in a
> Lucene document. Currently I am using MultiFields.getTermDocsEnum,
> iterating through the returned DocsEnum 'de' and getting the frequency
> with de.freq() for the desired document.
>
>   My solution gives me the result I want, but it is too slow. For
> instance, I want to calculate the term frequency of a given term for N
> documents in sequence. Every time I move to a new document, I have to
> retrieve exactly the same DocsEnum again and iterate until I find the
> document I want. Of course I cannot cache the DocsEnum (yes, I made this
> huge mistake) because it is an iterator.
>
>   Do you have any suggestions on how I can get term frequency quickly?
> The only suggestion I have had so far was "Do it programmatically, don't
> use Lucene". Should this be the solution?
>
>   Thank you.
>
>   Regards,
>   Bianca Pereira
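
One way to avoid re-reading the postings for every document, sketched
below, is to walk the iterator once per term and keep the frequencies
(not the iterator) in a map keyed by document ID. This is only a minimal
illustration, not code from the thread: the Postings interface and the
fromArrays helper are hypothetical stand-ins for a forward-only iterator
such as DocsEnum, so that the example runs without Lucene on the
classpath. With a real DocsEnum, the same loop would call nextDoc() and
freq() until NO_MORE_DOCS is returned.

```java
import java.util.HashMap;
import java.util.Map;

public class TermFreqCache {

    /** Hypothetical stand-in for a forward-only postings iterator such as DocsEnum. */
    interface Postings {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        /** Moves to the next document containing the term, or returns NO_MORE_DOCS. */
        int nextDoc();

        /** Term frequency in the current document. */
        int freq();
    }

    /** In-memory postings, used only to make the sketch runnable (an assumption). */
    static Postings fromArrays(final int[] docs, final int[] freqs) {
        return new Postings() {
            private int pos = -1;

            public int nextDoc() {
                pos++;
                return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
            }

            public int freq() {
                return freqs[pos];
            }
        };
    }

    /** Walks the iterator exactly once and records docId -> term frequency. */
    static Map<Integer, Integer> collect(Postings postings) {
        Map<Integer, Integer> tf = new HashMap<>();
        int doc;
        while ((doc = postings.nextDoc()) != Postings.NO_MORE_DOCS) {
            tf.put(doc, postings.freq());
        }
        return tf;
    }

    /** Looks up the cached frequency; 0 means the term does not occur in the document. */
    static int termFreq(Map<Integer, Integer> tf, int docId) {
        Integer f = tf.get(docId);
        return f == null ? 0 : f;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> tf =
                collect(fromArrays(new int[] {2, 5, 9}, new int[] {3, 1, 4}));
        System.out.println(termFreq(tf, 5)); // prints 1
        System.out.println(termFreq(tf, 7)); // prints 0
    }
}
```

The map costs memory proportional to the number of documents containing
the term. If the N documents can be visited in increasing document ID
order, an alternative that needs no map is to keep a single iterator and
move it forward with advance(docId), since it never has to go backwards.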


