lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter Keegan" <peterlkee...@gmail.com>
Subject Re: Announcement: Lucene powering Monster job search index (Beta)
Date Fri, 03 Nov 2006 18:55:04 GMT
Paramasivam,

Take a look at Solr, in particular the DocSetHitCollector class. The
collector simply sets a bit in a BitSet, or saves the docIds in an array
(for low hit counts). Solr's BitSet was optimized (by Yonik, I believe) to
be faster than Java's BitSet, so this HitCollector is very fast. This is
essentially what we are doing for counting.

Peter

On 11/2/06, Paramasivam Srinivasan <sri@indicus.net> wrote:
>
> Hi Peter
>
> When I use the CustomHitCollector, it affect the application performance.
> Also how you accomplish the grouping the results with out affecting
> performance. Also If possible give some code snippet for custome
> hitcollector.
>
> TIA
>
> Sri
>
> "Peter Keegan" <peterlkeegan@gmail.com> wrote in message
> news:e994873a0610300831p1229ad0k69de63cba052bd22@mail.gmail.com...
> > Joe,
> >
> > Fields with numeric values are stored in a separate file as binary
> values
> > in
> > an internal format. Lucene is unaware of this file and unaware of the
> > range
> > expression in the query. The range expression is parsed outside of
> Lucene
> > and used in a custom HitCollector to filter out documents that aren't in
> > the
> > requested range(s). A goal was to do this without having to modify
> Lucene.
> > Our scheme is pretty efficient, but not very general purpose in its
> > current
> > form, though.
> >
> > Peter
> >
> >
> > On 10/30/06, Joe Shaw <joeshaw@novell.com> wrote:
> >>
> >> Hi Peter,
> >>
> >> On Fri, 2006-10-27 at 15:29 -0400, Peter Keegan wrote:
> >> > Numeric range search is one of Lucene's weak points
> (performance-wise)
> >> so we
> >> > have implemented this with a custom HitCollector and an extension to
> >> > the
> >> > Lucene index files that stores the numeric field values for all
> >> documents.
> >> >
> >> > It is important to point out that this has all been implemented with
> >> > the
> >> > stock Lucene 2.0 library. No code changes were made to the Lucene
> core.
> >>
> >> Can you give some technical details on the extension to the Lucene
> index
> >> files?  How did you do it without making any changes to the Lucene
> core?
> >>
> >> Thanks,
> >> Joe
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message