lucene-java-user mailing list archives

From Manish Bafna <manish.bafna...@gmail.com>
Subject Re: HighFreqTerms for results set
Date Mon, 18 Jul 2011 12:47:19 GMT
Facet on that field. The facet counts will give you the top terms for the result set.
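
To illustrate what field faceting computes, here is a minimal sketch in plain Java. It is not the Solr or Lucene faceting API: the class `FacetCountSketch`, the method `topTerms`, and the in-memory forward index `fieldValues` are all hypothetical names made up for this example. It assumes each document's distinct terms for the field are available by docID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the Solr or Lucene faceting API.
final class FacetCountSketch {

    // fieldValues.get(docId) holds the distinct terms of the faceted field
    // for that document (an assumed in-memory forward index).
    static List<Map.Entry<String, Integer>> topTerms(
            List<List<String>> fieldValues, int[] resultDocIds, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        // Count each term once per matching document.
        for (int docId : resultDocIds) {
            for (String term : fieldValues.get(docId)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        // Highest count first; ties broken alphabetically for stable output.
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return sorted.subList(0, Math.min(limit, sorted.size()));
    }

    public static void main(String[] args) {
        List<List<String>> fieldValues = Arrays.asList(
                Arrays.asList("lucene", "index"),   // doc 0
                Arrays.asList("lucene", "facet"),   // doc 1
                Arrays.asList("solr", "facet"),     // doc 2
                Arrays.asList("lucene", "query"));  // doc 3
        int[] hits = {0, 1, 3};                     // pretend these matched the query
        System.out.println(topTerms(fieldValues, hits, 2));
    }
}
```

The work here is proportional to the number of hits times the terms per document, so it only touches terms that actually occur in the result set.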

On Mon, Jul 18, 2011 at 6:03 PM, Mihai Caraman <caraman.mihai@gmail.com> wrote:

> I looked around and found no viable solution for this problem:
> how to extract the most frequent terms from the search result set after
> submitting a query.
>
> HighFreqTerms <http://lucene.apache.org/java/3_2_0/api/contrib-misc/index.html>
> and docFreq
> <http://lucene.apache.org/java/3_2_0/api/core/org/apache/lucene/index/FilterIndexReader.html#docFreq%28org.apache.lucene.index.Term%29>
> don't do the job for specific documents.
>
> - Is it plausible to build a vector of the resulting docIDs and intersect it
> with each term's posting list in the index? A bigger intersection would mean
> a higher frequency.
>  * Because search results can be highly custom, this method can't be
> optimized to intersect only the highest-frequency terms of the entire
> index.
>
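
The intersection idea in the quoted message can be sketched in plain Java as follows. This is a toy in-memory index, not Lucene's postings API: the class `PostingIntersectionSketch`, the method `resultSetDocFreq`, and the `postings` map (term to array of docIDs) are hypothetical names for illustration, and the result set is assumed to be available as a `java.util.BitSet` of matching docIDs.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: a toy in-memory index, not Lucene's postings API.
final class PostingIntersectionSketch {

    // For each term, count how many documents in its posting list are also hits.
    // The returned map is sorted by term; terms with no matching document are omitted.
    static Map<String, Integer> resultSetDocFreq(Map<String, int[]> postings, BitSet hits) {
        Map<String, Integer> freq = new TreeMap<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            int count = 0;
            for (int docId : e.getValue()) {
                if (hits.get(docId)) {
                    count++;   // this posting is inside the result set
                }
            }
            if (count > 0) {
                freq.put(e.getKey(), count);
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("lucene", new int[]{0, 1, 3});
        postings.put("facet", new int[]{1, 2});
        postings.put("solr", new int[]{2});

        BitSet hits = new BitSet();
        hits.set(0);
        hits.set(1);
        hits.set(3);               // pretend docs 0, 1 and 3 matched the query

        System.out.println(resultSetDocFreq(postings, hits));
    }
}
```

Note that this visits every posting of every term, which matches the concern raised in the question: since the result set is arbitrary, the loop cannot be limited to the terms that are most frequent in the whole index.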
