lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Term numbering and range filtering
Date Tue, 11 Nov 2008 21:14:53 GMT
On Tuesday 11 November 2008 21:55:45, Michael McCandless wrote:
> Also, one nice optimization we could do with the "term number
> column-stride array" is to do bit packing (borrowing from the PFOR
> code) dynamically.
>
> Ie since we know there are X unique terms in this segment, when
> populating the array that maps docID to term number we could use
> exactly the right number of bits.  Enumerated fields with not many
> unique values (eg, country, state) would take relatively little RAM.
> With LUCENE-1231, where the fields are stored column-stride on disk,
> we could do this packing during indexing such that loading at search
> time is very fast.

Perhaps we'd better continue this at LUCENE-1231 or LUCENE-1410.
I think what you're referring to is PDICT, which has frame exceptions
for values that occur infrequently.
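
For reference, the fixed-width packing described in the quoted text can
be sketched roughly as follows. This is only a minimal illustration under
stated assumptions, not the PFOR or PDICT code from LUCENE-1410 and not
an existing Lucene API: the class PackedTermNumbers and all of its
methods are hypothetical names, and it has no frame exceptions. It
assumes term numbers run from 0 to uniqueTerms-1 and, for the range
check, that they were assigned in sorted term order.

```java
/**
 * Hypothetical sketch (not a Lucene class): maps docID to term number,
 * using the minimum number of bits per entry for a given term count.
 */
final class PackedTermNumbers {
    private final long[] blocks;     // packed values, 64 bits per block
    private final int bitsPerValue;  // width of one entry in bits
    private final long mask;         // low bitsPerValue bits set

    PackedTermNumbers(int docCount, int uniqueTerms) {
        // Bits needed for term numbers 0 .. uniqueTerms-1 (at least 1 bit).
        this.bitsPerValue =
            Math.max(1, 32 - Integer.numberOfLeadingZeros(uniqueTerms - 1));
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) docCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) / 64)];
    }

    int bitsPerValue() {
        return bitsPerValue;
    }

    /** Approximate RAM used by the packed array, in bytes. */
    long ramBytes() {
        return 8L * blocks.length;
    }

    void set(int docId, int termNumber) {
        long bitPos = (long) docId * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = termNumber & mask;
        // Clear the old entry's bits in this block, then write the new ones.
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            // The entry straddles two longs: its high bits go to the next block.
            int lowBits = bitsPerValue - spill;
            blocks[block + 1] =
                (blocks[block + 1] & ~(mask >>> lowBits)) | (value >>> lowBits);
        }
    }

    int get(int docId) {
        long bitPos = (long) docId * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            // Pull the remaining high bits from the next block.
            value |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return (int) (value & mask);
    }

    /**
     * Range check by int comparison. Only meaningful under the assumption
     * that term numbers follow sorted term order.
     */
    boolean inRange(int docId, int lowerTermNumber, int upperTermNumber) {
        int t = get(docId);
        return t >= lowerTermNumber && t <= upperTermNumber;
    }
}
```

With this layout a field with, say, 200 unique values needs 8 bits per
document instead of a full 32-bit int, which is the RAM saving for
enumerated fields mentioned in the quoted text.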

Regards,
Paul Elschot


>
> Mike
>
> Paul Elschot wrote:
> > On Tuesday 11 November 2008 11:29:27, Michael McCandless wrote:
> >> The other part of your proposal was to somehow "number" term text
> >> such that term range comparisons can be implemented as fast int
> >> comparisons.
> >
> > ...
> >
> >>   http://fontoura.org/papers/paramsearch.pdf
> >>
> >> However that'd be quite a bit deeper change to Lucene.
> >
> > The cheap version is hierarchical prefixing here:
> >
> > http://wiki.apache.org/jakarta-lucene/DateRangeQueries
> >
> > Regards,
> > Paul Elschot
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org