lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Getting Top n term for a given field for a given time period
Date Fri, 24 Apr 2009 17:22:45 GMT
Make a RangeFilter that visits only docs in your time period, then run
a search w/ a custom HitCollector that looks at the source of each doc
and tallies up the results?  For performance, you'll probably need to
load the source using FieldCache (FieldCache.DEFAULT.getStrings(...)).

Or, use Solr's faceted search, or http://www.browseengine.com, or some
other facets impl, to tally up the sources.

Mike

On Tue, Apr 21, 2009 at 6:52 PM, Preetham Kajekar <preetham@cisco.com> wrote:
>
> Hi,
> I have a lucene index which has 20 mil documents. Each document has a
> timestamp field and a source field. I am interested in finding the top n
> sources for a given hour (based on the timestamp). I know we can get the top
> n sources fields easily using the IndexReader API, but was wondering if I
> can get that top n for a certain period of time ?
>
> Thx,
> ~preetham
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message