lucene-dev mailing list archives

From lucene-...@jakarta.apache.org
Subject [Jakarta Lucene Wiki] Updated: SearchNumericalFields
Date Wed, 12 May 2004 02:19:04 GMT
   Date: 2004-05-11T19:19:04
   Editor: 150.101.152.16 <>
   Wiki: Jakarta Lucene Wiki
   Page: SearchNumericalFields
   URL: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields

   how to handle -ve numbers, by matt quail

Change Log:

------------------------------------------------------------------------------
@@ -66,3 +66,63 @@
   == For decimals ==
 
   If decimals cause problems, you can multiply the values by a fixed factor so that only whole numbers are indexed. (comment by sv)
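As a rough illustration of the multiplier idea, the sketch below scales a decimal string by a fixed factor and returns a whole number. The class name `ScaledDecimalDemo`, the method `toScaled`, and the factor of 100 (two decimal places) are assumptions made for this example, not part of the wiki page.

```java
import java.math.BigDecimal;

// Hypothetical helper, for illustration only.
class ScaledDecimalDemo {
    // Assumed multiplier: keeps two decimal places.
    static final BigDecimal MULTIPLIER = new BigDecimal(100);

    /**
     * Multiplies the decimal by MULTIPLIER and returns the result as a long.
     * Throws ArithmeticException if more than two decimal places were given,
     * because longValueExact() refuses to drop a nonzero fraction.
     */
    static long toScaled(String decimal) {
        return new BigDecimal(decimal).multiply(MULTIPLIER).longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(toScaled("12.34")); // prints 1234
    }
}
```

The scaled value can then be formatted for indexing like any other integer. BigDecimal is used here rather than double arithmetic so that a value such as 12.34 is not affected by binary rounding.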
+
+ == Handling positive and negative numbers ==
+ 
+ If you want a numerical field that may contain both positive and negative numbers, you still need to format them as strings. What you must ensure is that for any numbers a and b, if a<b then format(a)<format(b), where the strings are compared lexicographically. The problem cases are:
+   * when one number is negative and the other is positive
+   * when both are negative, e.g. -200 is less than -1, even though "-200" is lexicographically '''greater''' than "-1"
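The second problem case can be reproduced with a few lines of Java. This is only a sketch: the class name `NaiveOrderDemo` and the method `sortAsStrings` are made up for the example, and the sample values are arbitrary.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the wiki page.
class NaiveOrderDemo {
    /** Sorts ints by their plain decimal string form, i.e. lexicographically. */
    static String[] sortAsStrings(int... values) {
        String[] s = new String[values.length];
        for (int k = 0; k < values.length; k++) {
            s[k] = Integer.toString(values[k]);
        }
        Arrays.sort(s); // natural String order compares character by character
        return s;
    }

    public static void main(String[] args) {
        // Numeric order would be -200, -1, 3, 50.
        System.out.println(Arrays.toString(sortAsStrings(-200, -1, 3, 50)));
        // prints [-1, -200, 3, 50]
    }
}
```

The two negative values come out in the wrong order because "-1" and "-200" are compared on their second character, and '1' is less than '2'.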
+ 
+ The "tricks" for handling these problems are:
+   * use a prefix char on both positive and negative numbers so that a negative string is always less than a positive one. '-' and '0' are suitable for this, because '-' sorts before '0'
+   * "invert" the magnitude of negative numbers, so that a more negative number produces a smaller digit string
+ 
+ Here is some code for an encoder/decoder that does both of these things for ints in the range -10000 to 9999. You could modify it to accept a double so long as you change the FORMAT appropriately.
+ 
+ {{{
+ // Requires: import java.text.DecimalFormat;
+ private static final char NEGATIVE_PREFIX = '-';
+ // NB: NEGATIVE_PREFIX must be < POSITIVE_PREFIX
+ private static final char POSITIVE_PREFIX = '0';
+ public static final int MAX_ALLOWED = 9999;
+ public static final int MIN_ALLOWED = -10000;
+ private static final String FORMAT = "00000";
+ 
+ /**
+  * Converts an int to a String suitable for indexing.
+  */
+ public static String encode(int i) {
+     if ((i < MIN_ALLOWED) || (i > MAX_ALLOWED)) {
+         throw new IllegalArgumentException("out of allowed range");
+     }
+     char prefix;
+     if (i < 0) {
+         prefix = NEGATIVE_PREFIX;
+         // "invert" the magnitude: -10000 maps to 0 and -1 maps to 9999
+         i = MAX_ALLOWED + i + 1;
+     } else {
+         prefix = POSITIVE_PREFIX;
+     }
+     DecimalFormat fmt = new DecimalFormat(FORMAT);
+     return prefix + fmt.format(i);
+ }
+ 
+ /**
+  * Converts a String that was returned by {@link #encode} back to
+  * an int.
+  */
+ public static int decode(String str) {
+     char prefix = str.charAt(0);
+     int i = Integer.parseInt(str.substring(1));
+     if (prefix == POSITIVE_PREFIX) {
+         // nop
+     } else if (prefix == NEGATIVE_PREFIX) {
+         // undo the inversion applied in encode
+         i = i - MAX_ALLOWED - 1;
+     } else {
+         throw new NumberFormatException("string does not begin with the correct prefix");
+     }
+     return i;
+ }
+ }}}
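To see that the encoded strings sort in numeric order, the snippet can be wrapped in a class and run against a few values. The sketch below repeats the encode/decode logic from the snippet above so that it stands alone. The class name `SignedIntOrderDemo` and the sample values are assumptions made for the example, and it pins the formatter to Locale.ROOT (which the wiki code does not do) so that the digits do not depend on the default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical wrapper class around the encode/decode logic shown above.
class SignedIntOrderDemo {
    static final int MAX_ALLOWED = 9999;
    static final int MIN_ALLOWED = -10000;

    static String encode(int i) {
        if (i < MIN_ALLOWED || i > MAX_ALLOWED) {
            throw new IllegalArgumentException("out of allowed range");
        }
        char prefix = '0';
        if (i < 0) {
            prefix = '-';
            i = MAX_ALLOWED + i + 1; // -10000 maps to 0, -1 maps to 9999
        }
        // Locale.ROOT is an addition in this sketch, see the lead-in.
        DecimalFormat fmt =
            new DecimalFormat("00000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return prefix + fmt.format(i);
    }

    static int decode(String str) {
        int i = Integer.parseInt(str.substring(1));
        return str.charAt(0) == '-' ? i - MAX_ALLOWED - 1 : i;
    }

    public static void main(String[] args) {
        int[] values = {50, -1, 3, -200, 0};
        String[] keys = new String[values.length];
        for (int k = 0; k < values.length; k++) {
            keys[k] = encode(values[k]);
        }
        Arrays.sort(keys); // lexicographic sort of the encoded strings
        System.out.println(Arrays.toString(keys));
        // prints [-09800, -09999, 000000, 000003, 000050]
    }
}
```

Decoding the sorted keys gives -200, -1, 0, 3, 50, which is the numeric order.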
+ 
+ === Handling larger numbers ===
+ 
+ Code for a class that handles all possible long values is available here: http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg04790.html
+ 
+ That code handles some special cases near Long.MIN_VALUE and uses a large radix so that the resulting strings are "compressed".
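For readers who only want the general idea, here is a minimal sketch of one way to cover the full long range. It is not the code from the linked message and uses a different technique: it flips the sign bit so that signed order becomes unsigned order, then writes the value as a fixed-width base-36 string. The class name `LongSortKeyDemo`, the radix of 36, and the width of 13 are choices made for this example, and it relies on the unsigned helpers in java.lang.Long that were added in Java 8.

```java
// Hypothetical sketch, not the class from the linked mailing-list message.
class LongSortKeyDemo {
    private static final int RADIX = 36;
    // 36^13 is greater than 2^64, so 13 base-36 digits fit any 64-bit value.
    private static final int WIDTH = 13;

    static String encode(long value) {
        // XOR with Long.MIN_VALUE flips the sign bit: Long.MIN_VALUE becomes 0
        // and Long.MAX_VALUE becomes the largest unsigned 64-bit value.
        long shifted = value ^ Long.MIN_VALUE;
        String digits = Long.toUnsignedString(shifted, RADIX);
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int k = digits.length(); k < WIDTH; k++) {
            sb.append('0'); // left-pad so every key has the same length
        }
        return sb.append(digits).toString();
    }

    static long decode(String key) {
        return Long.parseUnsignedLong(key, RADIX) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, -1L, 0L, 1L, Long.MAX_VALUE};
        for (long v : samples) {
            System.out.println(v + " -> " + encode(v));
        }
    }
}
```

Because every key has the same width and the digits '0'-'9' sort before 'a'-'z', comparing two keys as strings gives the same result as comparing the original longs.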

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
