lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-1470) Add TrieRangeFilter to contrib
Date Mon, 16 Feb 2009 16:45:00 GMT


Uwe Schindler commented on LUCENE-1470:

Hi Ning,
thanks for the suggestion. I was thinking about that, too. In general, one idea would be to use
32 bit integers or floats if you do not need that much accuracy. In this case, the number
of terms is reduced, too.
But it may be a good option to add a setting so that values are indexed with the highest possible
precision and additionally with lower precision values, too. The precision step
could then be dynamic, like:
a) the precision step gets bigger for lower precisions
b) after a precision of XX bits, no more lower precisions are generated and queried. This may
be possible to implement by e.g. an array of precision step values that gives the splitting
of the whole long/int into different precisions (like 2-2-2-2-8-8-8-8-8-16, so precise values
use a precision step of 2 bits, e.g. from shift 0 to 2, but from shift 48 to 64 a step value of
16 is used).
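
To make the step-array idea concrete, here is a minimal sketch in Java. It is not code from the patch: the class name PrecisionSteps and the methods shiftsFor and truncate are hypothetical, and it assumes the array lists step widths starting at the lowest bits. An array whose steps sum to less than the total bit width is taken to mean "stop generating lower precisions after that point", as in (b).

```java
import java.util.Arrays;

// Hypothetical helper, not part of the TrieRangeFilter patch.
public class PrecisionSteps {

    /**
     * Converts per-level step widths (lowest bits first) into the shift
     * applied at each precision level. Steps may sum to less than
     * totalBits, in which case no coarser levels are produced.
     */
    public static int[] shiftsFor(int[] steps, int totalBits) {
        int[] shifts = new int[steps.length];
        int shift = 0;
        for (int i = 0; i < steps.length; i++) {
            if (steps[i] <= 0) {
                throw new IllegalArgumentException("step must be positive: " + steps[i]);
            }
            shifts[i] = shift;
            shift += steps[i];
        }
        if (shift > totalBits) {
            throw new IllegalArgumentException("steps cover " + shift + " bits, more than " + totalBits);
        }
        return shifts;
    }

    /** Clears the lowest `shift` bits (shift must be below 64). */
    public static long truncate(long value, int shift) {
        return (value >>> shift) << shift;
    }

    public static void main(String[] args) {
        int[] steps = {2, 2, 2, 2, 8, 8, 8, 8, 8, 16};
        int[] shifts = shiftsFor(steps, 64);
        System.out.println(Arrays.toString(shifts));
        long value = 0x0123456789ABCDEFL;
        for (int s : shifts) {
            // One line per precision level that would be indexed for this value.
            System.out.printf("shift %2d -> %016x%n", s, truncate(value, s));
        }
    }
}
```

With the 2-2-2-2-8-8-8-8-8-16 array this gives ten levels at shifts 0, 2, 4, 6, 8, 16, 24, 32, 40 and 48, so the last level covers bits 48 to 64 with a step of 16.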


> Add TrieRangeFilter to contrib
> ------------------------------
>                 Key: LUCENE-1470
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>    Affects Versions: 2.4
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>         Attachments: fixbuild-LUCENE-1470.patch, fixbuild-LUCENE-1470.patch, LUCENE-1470-readme.patch,
> LUCENE-1470-revamp.patch, LUCENE-1470-revamp.patch, LUCENE-1470-revamp.patch, LUCENE-1470.patch,
> LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch,
>
> According to the thread in java-dev (
> and, I want to include my fast
> numerical range query implementation into lucene contrib-queries.
> I implemented (based on RangeFilter) another approach for faster
> RangeQueries, based on longs stored in index in a special format.
> The idea behind this is to store the longs in different precisions in the index
> and partition the query range in such a way that the outer boundaries are
> searched using terms from the highest precision, but the center of the search
> range with lower precision. The implementation stores the longs in 8
> different precisions (using a class called TrieUtils). It also has support
> for Doubles, using the IEEE 754 floating-point "double format" bit layout
> with some bit mappings to make them binary sortable. The approach is used in
> rather big indexes, query times are even on low performance desktop
> computers <<100 ms (!) for very big ranges on indexes with 500000 docs.
> I called this RangeQuery variant and format "TrieRangeRange" query because
> the idea looks like the well-known Trie structures (but it is not identical
> to real tries, but algorithms are related to it).
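
The range partitioning described in the quoted text can be sketched roughly as follows. This is an illustration only, not the code from the attached patches: the names TrieRangeSketch, SubRange and split are hypothetical, it uses one fixed precision step rather than a step array, and it was only walked through for non-negative values. Each SubRange says "match the terms indexed at this shift whose value lies between lo and hi".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the range-splitting idea; not the patch's code.
public class TrieRangeSketch {

    /** One sub-range to be matched against terms indexed at `shift`. */
    public static final class SubRange {
        public final int shift;
        public final long lo, hi;

        SubRange(int shift, long lo, long hi) {
            this.shift = shift;
            this.lo = lo;
            this.hi = hi;
        }

        /** Distinct terms this sub-range touches at its precision level. */
        public long termCount() { return (hi >>> shift) - (lo >>> shift) + 1; }

        /** Full-precision values this sub-range covers. */
        public long valueCount() { return hi - lo + 1; }
    }

    /** Splits [min, max] into edge ranges at fine precision and a coarse center. */
    public static List<SubRange> split(long min, long max, int step) {
        List<SubRange> out = new ArrayList<>();
        if (min > max) return out;
        for (int shift = 0; ; shift += step) {
            long diff = 1L << (shift + step);
            long mask = ((1L << step) - 1L) << shift;
            // An edge is "ragged" if it is not aligned to the next coarser level.
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            boolean wrapped = nextMin < min || nextMax > max;
            if (shift + step >= 64 || nextMin > nextMax || wrapped) {
                // Nothing left for a coarser level: emit the remaining center here.
                out.add(range(shift, min, max));
                break;
            }
            if (hasLower) out.add(range(shift, min, min | mask));
            if (hasUpper) out.add(range(shift, max & ~mask, max));
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    private static SubRange range(int shift, long lo, long hi) {
        // Fill the bits below `shift` so hi is the last value the term covers.
        return new SubRange(shift, lo, hi | ((1L << shift) - 1L));
    }
}
```

For example, split(3, 70, 4) returns three sub-ranges: 3..15 and 64..70 at shift 0, and 16..63 at shift 4. Together they touch 23 terms instead of the 68 a full-precision range would enumerate, and the saving grows with the width of the range.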

