lucene-java-user mailing list archives

From "Daniel Rosher" <rosh...@googlemail.com>
Subject Re: Announcement: Lucene powering Monster job search index (Beta)
Date Fri, 03 Nov 2006 16:01:02 GMT
Hi Peter,

Does this mean you are calculating the Euclidean distance twice: once in the
HitCollector to filter out 'out of range' documents, and then again in the
custom Comparator to sort the returned documents? I ask especially since the
filtering is done outside Lucene.
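
To make the question concrete, here is a minimal sketch of one way to compute
each distance only once. It assumes the per-document coordinates are already
loaded into arrays indexed by document id. The names DistanceOnce,
RangeCollector and Hit are hypothetical, and RangeCollector only mirrors the
shape of HitCollector.collect(int doc, float score) rather than extending it.
It is not a description of Peter's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: all names here are hypothetical, and nothing depends on Lucene.
final class DistanceOnce {

    /** A document that passed the range filter, with its distance already computed. */
    static final class Hit {
        final int doc;
        final double distance;

        Hit(int doc, double distance) {
            this.doc = doc;
            this.distance = distance;
        }
    }

    /** Stand-in for a custom collector; collect() has the same shape as HitCollector.collect. */
    static final class RangeCollector {
        private final double[] xs;        // assumed: x coordinate per document id
        private final double[] ys;        // assumed: y coordinate per document id
        private final double originX;
        private final double originY;
        private final double maxDistance;
        private final List<Hit> hits = new ArrayList<>();

        RangeCollector(double[] xs, double[] ys,
                       double originX, double originY, double maxDistance) {
            this.xs = xs;
            this.ys = ys;
            this.originX = originX;
            this.originY = originY;
            this.maxDistance = maxDistance;
        }

        /** Computes the Euclidean distance once and keeps it if the document is in range. */
        void collect(int doc, float score) {
            double dx = xs[doc] - originX;
            double dy = ys[doc] - originY;
            double distance = Math.sqrt(dx * dx + dy * dy);
            if (distance <= maxDistance) {
                hits.add(new Hit(doc, distance));
            }
        }

        /** Sorts by the cached distance, so no distance is recalculated here. */
        List<Hit> sortedByDistance() {
            List<Hit> sorted = new ArrayList<>(hits);
            sorted.sort(Comparator.comparingDouble(h -> h.distance));
            return sorted;
        }
    }
}
```

The trade-off in this sketch is memory: one cached double per matching
document, in exchange for skipping the second calculation during the sort.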

Regards,
Dan


>Joe,
>
>Fields with numeric values are stored in a separate file as binary values in
>an internal format. Lucene is unaware of this file and unaware of the range
>expression in the query. The range expression is parsed outside of Lucene
>and used in a custom HitCollector to filter out documents that aren't in the
>requested range(s). A goal was to do this without having to modify Lucene.
>Our scheme is pretty efficient, but not very general purpose in its current
>form, though.
>
>Peter
>
>
>On 10/30/06, Joe Shaw <joeshaw@novell.com> wrote:
>>
>> Hi Peter,
>>
>> On Fri, 2006-10-27 at 15:29 -0400, Peter Keegan wrote:
>> > Numeric range search is one of Lucene's weak points (performance-wise) so we
>> > have implemented this with a custom HitCollector and an extension to the
>> > Lucene index files that stores the numeric field values for all documents.
>> >
>> > It is important to point out that this has all been implemented with the
>> > stock Lucene 2.0 library. No code changes were made to the Lucene core.
>>
>> Can you give some technical details on the extension to the Lucene index
>> files?  How did you do it without making any changes to the Lucene core?
>>
>> Thanks,
>> Joe
