lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Updated: (LUCENE-1470) Add TrieRangeQuery to contrib
Date Sat, 07 Feb 2009 23:32:59 GMT


Uwe Schindler updated LUCENE-1470:


I modified the proposed TrieUtils a little bit. Maybe this class could become the new NumberUtils,
with the extra option to trie-encode the values.

 - Added 32bit support
 - Merged the unsigned int/long handling into the prefixCode methods. Exposing the "sortableBits"
as public API is not needed. For TrieRangeFilter the raw bits will not be needed (I need comparable
 - The conversions from doubles and floats were renamed and now return the standard signed long/int:
doubleToSortableLong and floatToSortableInt. Date support was removed, since Date.getTime() can
simply be used instead.
 - Still missing are Long/IntParser for FieldCache and a SortField factory.
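
To illustrate the double/float conversion named in the list above, here is a minimal Java sketch. The method names doubleToSortableLong and floatToSortableInt come from the message; the class name and the method bodies are assumptions (one common way to make IEEE 754 bits sort like signed integers), not necessarily what the attached patch does.

```java
// Sketch only: the bodies below are an assumed implementation,
// not the code from the attached TrieUtils patch.
final class SortableBitsSketch {
    private SortableBitsSketch() {}

    /** Maps a double to a signed long whose natural order matches the double's order. */
    static long doubleToSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        // For negative doubles a larger magnitude gives larger raw bits, so the
        // 63 non-sign bits are flipped to reverse their order. Positive values
        // are left unchanged.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    /** Same idea for 32-bit floats, returning a signed int. */
    static int floatToSortableInt(float value) {
        int bits = Float.floatToIntBits(value);
        if (bits < 0) {
            bits ^= 0x7fffffff;
        }
        return bits;
    }
}
```

With such a mapping, comparing the resulting longs or ints with the ordinary signed comparison gives the same order as comparing the original floating-point values.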

It's still untested!

Tomorrow (now it's time to go to bed) I will implement the TrieRangeFilter in two variants
(one for 32-bit ints, another for 64-bit longs). The min/max values are the ints/longs. The
trie coding is also done using the shift value and bit magic. The results of the range split
are then encoded using TrieUtils.xxxToPrefixCode().
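
The range split described above could look roughly like the following Java sketch for the 64-bit case. The class name, the split method and the long[] {min, max, shift} result layout are hypothetical; the sketch only shows how a shift value and bit masks can cut a range into a few sub-ranges per precision level, and it leaves out the final encoding step via TrieUtils.xxxToPrefixCode().

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: names and result layout are assumptions for illustration.
final class TrieRangeSplitSketch {
    private TrieRangeSplitSketch() {}

    /**
     * Splits [min, max] into sub-ranges {min, max, shift}. A sub-range with a
     * given shift is aligned to 2^shift, so it can be matched with terms that
     * drop the lowest 'shift' bits.
     */
    static List<long[]> split(long min, long max, int precisionStep) {
        if (precisionStep < 1 || precisionStep > 32) {
            throw new IllegalArgumentException("precisionStep must be in 1..32");
        }
        List<long[]> out = new ArrayList<>();
        if (min > max) {
            return out;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            // Edges that are not aligned to the next lower precision need their own sub-range.
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            boolean wrapped = nextMin < min || nextMax > max;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // Nothing is left for lower precisions: emit the rest at this level.
                add(out, min, max, shift);
                break;
            }
            if (hasLower) add(out, min, min | mask, shift);
            if (hasUpper) add(out, max & ~mask, max, shift);
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    private static void add(List<long[]> out, long min, long max, int shift) {
        // Fill the bits below 'shift' on the upper bound so the stored range is inclusive.
        out.add(new long[] {min, max | ((1L << shift) - 1L), shift});
    }
}
```

For example, split(3, 1000, 4) yields two edge ranges at shift 0, two at shift 4 and one center range at shift 8, which together cover 3..1000 exactly. A precisionStep of 8 would correspond to 8 precision levels for a 64-bit value.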

> Add TrieRangeQuery to contrib
> -----------------------------
>                 Key: LUCENE-1470
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/*
>    Affects Versions: 2.4
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>         Attachments: fixbuild-LUCENE-1470.patch, fixbuild-LUCENE-1470.patch, LUCENE-1470-readme.patch,
LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch, LUCENE-1470.patch,
LUCENE-1470.patch, LUCENE-1470.patch,,
> According to the thread in java-dev (
and, I want to include my fast
numerical range query implementation into lucene contrib-queries.
> I implemented (based on RangeFilter) another approach for faster
> RangeQueries, based on longs stored in index in a special format.
> The idea behind this is to store the longs in different precisions in the index
> and partition the query range in such a way that the outer boundaries are
> searched using terms from the highest precision, but the center of the search
> range with lower precision. The implementation stores the longs in 8
> different precisions (using a class called TrieUtils). It also has support
> for Doubles, using the IEEE 754 floating-point "double format" bit layout
> with some bit mappings to make them binary sortable. The approach is used in
> rather big indexes; query times are well under 100 ms (!), even on low performance
> desktop computers, for very big ranges on indexes with 500000 docs.
> I called this RangeQuery variant and format "TrieRangeRange" query because
> the idea looks like the well-known Trie structures (but it is not identical
> to real tries, but algorithms are related to it).
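
As a rough illustration of "storing the longs in 8 different precisions" from the quoted description, the following Java sketch derives one value per precision level by dropping low-order bits. The class and method names are hypothetical, the value is assumed to be non-negative (or already mapped to sortable bits), and the actual term format produced by TrieUtils is not shown in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: illustrates the precision levels, not the real term encoding.
final class TriePrecisionSketch {
    private TriePrecisionSketch() {}

    /** Returns one {shift, prefix} pair per precision level, highest precision first. */
    static List<long[]> precisionLevels(long value, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be positive");
        }
        List<long[]> levels = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            // Dropping the lowest 'shift' bits puts 2^shift neighbouring values under one prefix.
            levels.add(new long[] {shift, value >>> shift});
        }
        return levels;
    }
}
```

With a step of 8 bits a long gets 8 levels. Two nearby values such as 0x1234 and 0x12FF differ at shift 0 but share the prefix 0x12 at shift 8, which is what lets the center of a large range be matched with few low-precision terms.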
