lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: solution: RangeQuery with floating point numbers
Date Sun, 09 Apr 2006 14:30:20 GMT
On Sunday 09 April 2006 13:53, Nadav Har'El wrote:
> 
> Hi all,
> 
...
> 
> By the way, it is worth repeating the warning that appears everywhere
> (including Lucene In Action): while this sort of trick works, RangeQuery
> is extremely inefficient (and might not even work) when the number of
> different values that are included in the range is very large.
> ConstantScoreRangeQuery is better, but also inefficient. My code allows
> you to use these queries for floating point values, but in no way makes this
> use efficient :-(
> 
> P.S. I wonder if anyone can suggest an alternative implementation for
> mantissaFormat() below, which doesn't require a DecimalFormat
> object creation.
>
... 
> 
> /* Our string representation of a floating point number will look as
>  * follows:
>  *   SEEENN...

One can speed this up by also indexing prefixes as partial numbers in a
separate field, for example:
S
SE
SEE
SEEEN
SEEENN
SEEENNN
and adapt the numeric range search to use only the shortest 
ones needed. Iirc using base 3 for the digits N minimizes the
number of terms used for a range search in this case, but it's
quite a while ago that I did the math for this, and I did not include
the exponent digits E at the time...

Regards,
Paul Elschot

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message