lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Wang" <john.w...@gmail.com>
Subject Re: Suggestions for drill downs
Date Fri, 05 Dec 2008 02:45:35 GMT
Glad to be of help.
Understand that FieldCache lives in a map in the static memory and is keyed
by an IndexReader. So if your reader updates often there might be an issue
of cleaning the map.

This is a question for the Luceners, when you call IndexReader.reopen, how
is FieldCache updated?

-John

On Thu, Dec 4, 2008 at 5:46 PM, Muralidharan V <muralidharan@gmail.com>wrote:

> John,
>
>     Using the FieldCache worked well. Thanks!
>
> -Murali
>
> On Thu, Dec 4, 2008 at 3:10 PM, John Wang <john.wang@gmail.com> wrote:
>
> > Easiest way to do this is using the FieldCache. It constructs a
> StringIndex
> > object which gives you very fast lookup to the field value (index) given
> a
> > docid. Create a parallel count array to the lookup array for the
> > StringIndex. Run your HitCollector thru should be fast.
> > Loading FieldCache maybe expensive, but you do it only once when opening
> > the
> > reader.
> >
> > This is one way of implementing Faceted Search with Lucene but there are
> > lotsa things you can do to make this faster.
> >
> > This is of course assuming your Brand field is indexed and not tokenized.
> >
> > One side-benefit is that if you do this, sorting on Brand would be fast
> > because FieldCache would be already loaded.
> >
> > Hope this helps
> >
> > -John
> >
> >
> > On Thu, Dec 4, 2008 at 3:02 PM, Muralidharan V <muralidharan@gmail.com
> > >wrote:
> >
> > > We are evaluating lucene for a product search engine. One requirement
> is
> > > that we be able to suggest the top n brands(the ones with most products
> > in
> > > the result set) for a given search term to further refine the search
> > query.
> > > The brand is stored in a separate field and searches are performed
> > against
> > > product description and brand.
> > >
> > > One option is to use a custom HitCollector to keep track of the brands
> in
> > > the result set but that would require reading the brand field for each
> > doc
> > > that matches the search term. We think this will be an order of
> magnitude
> > > slower. Is there anything else that we can do?
> > >
> > > Thanks,
> > > Murali
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message