lucene-java-user mailing list archives

From Mihai Caraman <caraman.mi...@gmail.com>
Subject Re: HighFreqTerms for results set
Date Mon, 18 Jul 2011 13:37:44 GMT
Faceted search is for single-term fields, right? Isn't it bad practice to
apply it to every word of every field in the result set (if that is even
possible)?

Again, I want to find the most frequent word in a result set. The words are
in fields that contain whole phrases, not in a field of their own.

2011/7/18 Manish Bafna <manish.bafna.82@gmail.com>

> Facet on that field. It will bring up the top words.
>
> On Mon, Jul 18, 2011 at 6:03 PM, Mihai Caraman <caraman.mihai@gmail.com>
> wrote:
>
> > I looked around and found no viable solution for this problem:
> > how to extract the most frequent terms in the search result set after
> > submitting the query.
> >
> > HighFreqTerms
> > <http://lucene.apache.org/java/3_2_0/api/contrib-misc/index.html> and
> > docFreq
> > <http://lucene.apache.org/java/3_2_0/api/core/org/apache/lucene/index/FilterIndexReader.html#docFreq%28org.apache.lucene.index.Term%29>
> > don't do the job for specific documents.
> >
> > - Is it plausible to build a vector of the result docIDs and intersect it
> > with each term's posting list in the index? A bigger intersection would
> > mean a higher frequency.
> >  *Because search results can be arbitrary, this method can't be
> > optimized to intersect only the highest-frequency terms of the entire
> > index.
> >
>
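
The intersection idea from the quoted message can be sketched roughly as
follows. This is only an illustration under stated assumptions, not Lucene
code: the inverted index is a plain in-memory Map from term to docIDs, the
result set is a java.util.BitSet, and the names ResultSetTermCounts,
TermCount and topTerms are hypothetical. In a real index the postings would
have to be read from the index reader instead.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts, for every term, how many
// documents of a given result set contain it, and returns the top n terms.
public class ResultSetTermCounts {

    /** A term together with the number of result-set documents containing it. */
    public static final class TermCount {
        public final String term;
        public final int count;

        TermCount(String term, int count) {
            this.term = term;
            this.count = count;
        }
    }

    /**
     * @param postings   toy inverted index (assumption): term -> docIDs containing it
     * @param resultDocs docIDs returned by the query, one bit per document
     * @param n          maximum number of terms to return
     */
    public static List<TermCount> topTerms(Map<String, int[]> postings,
                                           BitSet resultDocs, int n) {
        List<TermCount> counts = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            // Size of the intersection between this posting list and the result set.
            int hits = 0;
            for (int docId : entry.getValue()) {
                if (resultDocs.get(docId)) {
                    hits++;
                }
            }
            if (hits > 0) {
                counts.add(new TermCount(entry.getKey(), hits));
            }
        }
        // Highest count first; ties are broken alphabetically for a stable order.
        counts.sort((a, b) -> a.count != b.count
                ? Integer.compare(b.count, a.count)
                : a.term.compareTo(b.term));
        return new ArrayList<>(counts.subList(0, Math.min(n, counts.size())));
    }
}
```

Note that this walks the posting list of every term, so the cost grows with
the vocabulary of the whole index rather than with the size of the result
set, which is the concern raised in the quoted message. It also counts
documents per term (a per-result-set document frequency), not the number of
occurrences inside each document.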
