lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: Calculate Term Frequency
Date Tue, 19 Aug 2014 16:44:08 GMT
Have you looked into term vectors? I think they should fit the bill
pretty neatly.  Here's a nice blog post with helpful background info:
http://blog.jpountz.net/post/41301889664/putting-term-vectors-on-a-diet
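To make the idea concrete, here is a minimal plain-Java sketch of what a
term vector gives you conceptually: a per-document map from term to
frequency, so a lookup for one document does not have to walk the postings
of the whole index. This is only an illustration, not the Lucene API. The
class TermVectorSketch, its methods, and the whitespace tokenizer are all
hypothetical. In Lucene itself the field would need to be indexed with term
vectors enabled.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Lucene.
final class TermVectorSketch {

    // Builds a term -> frequency map for ONE document.
    // Assumption: lower-casing plus whitespace splitting stands in for a real analyzer.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Frequency of a term in the document, or 0 if the term does not occur.
    static int termFrequency(Map<String, Integer> vector, String term) {
        return vector.getOrDefault(term, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> doc = termVector("to be or not to be");
        System.out.println(termFrequency(doc, "be"));      // prints 2
        System.out.println(termFrequency(doc, "lucene"));  // prints 0
    }
}
```

The point of the per-document layout is that the cost of a lookup depends on
the size of that one document rather than on how many documents contain the
term.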

-Mike

On 8/19/2014 10:04 AM, Bianca Pereira wrote:
> Hi everybody,
>
>    I would like to know your suggestions for calculating term frequency in a
> Lucene document. Currently I am using MultiFields.getTermDocsEnum,
> iterating through the DocsEnum 'de' returned and getting the frequency with
> de.freq() for the desired document.
>
>    My solution gives me the result I want, but it is too slow. For
> instance, I want to calculate the term frequency of a given term for N
> documents in sequence. Every time I move to a new document I have to
> retrieve exactly the same DocsEnum again and iterate until I find the
> document I want. Of course I cannot cache the DocsEnum (yes, I made this huge
> mistake) because it is an iterator.
>
>   Do you have any suggestions on how I can get the term frequency quickly?
> The only suggestion I have had so far was "Do it programmatically, don't use
> Lucene". Should this be the solution?
>
>    Thank you.
>
>    Regards,
>    Bianca Pereira
>
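
On the caching problem described in the quoted message: the iterator cannot
be reused, but the values it yields can be. A rough sketch, assuming the
postings for the term fit in memory, is to walk the iterator once and keep a
docID -> frequency map. The Posting class and PostingsCacheSketch below are
hypothetical stand-ins and not Lucene classes. A plain java.util.Iterator
plays the role of the forward-only DocsEnum.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Lucene.
final class PostingsCacheSketch {

    // One posting for a fixed term: a document ID and the term's frequency in it.
    static final class Posting {
        final int docId;
        final int freq;

        Posting(int docId, int freq) {
            this.docId = docId;
            this.freq = freq;
        }
    }

    // Consumes the forward-only iterator exactly once and caches what it yields.
    static Map<Integer, Integer> drain(Iterator<Posting> postings) {
        Map<Integer, Integer> freqByDoc = new HashMap<>();
        while (postings.hasNext()) {
            Posting p = postings.next();
            freqByDoc.put(p.docId, p.freq);
        }
        return freqByDoc;
    }

    public static void main(String[] args) {
        List<Posting> postings = Arrays.asList(
                new Posting(3, 2), new Posting(7, 1), new Posting(12, 5));
        Map<Integer, Integer> cache = drain(postings.iterator());

        // Lookups can now happen in any order without re-reading the postings.
        System.out.println(cache.getOrDefault(12, 0)); // prints 5
        System.out.println(cache.getOrDefault(3, 0));  // prints 2
        System.out.println(cache.getOrDefault(4, 0));  // prints 0 (term absent)
    }
}
```

If the documents are visited in increasing docID order, another option is to
keep a single enum and move it forward with advance() instead of fetching a
new one for every document.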