lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: NumberTools
Date Mon, 21 Mar 2005 18:59:28 GMT
: One annoyance I have run across is the impedance mismatch between
: range queries and sorting.
: If your terms are  indexed as standard numbers, then integer sorting
: is fast, but range queries don't work (for negative values).  If you
: format the terms such that range queries work for any integer, then
: you have to use the slower string (or custom) sorting.
: Is there a way around this besides writing my own custom sorting hit collector?
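
To make the mismatch in the quoted question concrete, here is a minimal sketch in plain Java with no Lucene dependency. The `encode`/`decode` pair is a hypothetical fixed-width encoding chosen for illustration (sign bit flipped, eight hex digits); it is not the format used by NumberTools.

```java
// Hypothetical helper for illustration only; not Lucene's NumberTools format.
final class SortableIntDemo {

    // Flip the sign bit so negative values order before positive ones,
    // then print as exactly 8 lowercase hex digits.
    static String encode(int value) {
        return String.format("%08x", value ^ Integer.MIN_VALUE);
    }

    // Reverse of encode: parse the 8 hex digits and flip the sign bit back.
    static int decode(String term) {
        return ((int) Long.parseLong(term, 16)) ^ Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        // Plain decimal terms: "-1" orders before "-2" as a string,
        // although -1 is the larger integer, so a term range misbehaves.
        System.out.println("-1".compareTo("-2") < 0);

        // Encoded terms compare the same way the integers do.
        System.out.println(encode(-2).compareTo(encode(-1)) < 0);
        System.out.println(encode(-1).compareTo(encode(0)) < 0);

        // The original value can be recovered from the encoded term.
        System.out.println(decode(encode(-42)));
    }
}
```

With terms like these, range queries work for any integer, but the terms no longer parse with `Integer.parseInt`, which is the trade-off the question describes.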

Yeah, this is something that's never really made sense to me. I've tried
digging into the code to understand it a couple of times, but I've never
had much success. Maybe my assumptions or my understanding are wrong...

   1) Lucene stores all fields as Strings
   2) You can construct a "Sort" object with a SortField of type "INT"
   3) according to tribal wisdom (and Lucene in Action), sorting by a
      numeric field caches the numeric value and is more efficient than
      sorting by a string field (in which the string value needs to be cached)

1+2+3 tells me that at some point, when the search/sort code sees a
SortField of type "INT" (or of type AUTO, and the value of that field in
the first doc looks like an INT), a single pass is done to convert
the string value of the field from disk into a numeric value for caching
(and sorting).
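
The single pass guessed at above could look roughly like the following. This is a sketch of the assumption only, not Lucene's actual code: the class name and the `termByDoc` array (one stored string per document number) are hypothetical stand-ins for whatever the index reader provides.

```java
// Sketch of the assumed behaviour; names are hypothetical, not Lucene classes.
final class IntSortCacheSketch {

    // One pass over the stored terms: termByDoc[doc] is the field's string
    // value for that document, and the result holds the parsed int per doc.
    static int[] buildCache(String[] termByDoc) {
        int[] cache = new int[termByDoc.length];
        for (int doc = 0; doc < termByDoc.length; doc++) {
            cache[doc] = Integer.parseInt(termByDoc[doc]);
        }
        return cache;
    }

    public static void main(String[] args) {
        int[] cache = buildCache(new String[] {"7", "-3", "12"});
        // Sorting can now compare cache[docA] with cache[docB] directly,
        // without touching the strings again.
        System.out.println(java.util.Arrays.toString(cache));
    }
}
```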

     So why couldn't a user-specified NumberFormat object be used to
     convert that string into an Integer?  That would allow people to format
     their numbers in a way that sorts lexicographically for Range Filters,
     but still get the good Numeric Sort efficiency.
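
A minimal sketch of that idea, assuming the cache-building pass accepts a caller-supplied parser. Everything here is hypothetical: `buildCache`, `decodeSortableHex`, and the use of `ToIntFunction<String>` in place of a NumberFormat are illustrative choices, not an existing Lucene API.

```java
import java.util.function.ToIntFunction;

// Hypothetical sketch: the string-to-int conversion is supplied by the caller.
final class PluggableIntCacheSketch {

    // Same single pass as a plain Integer.parseInt cache, but the parser
    // is a parameter, so any term format can feed the int cache.
    static int[] buildCache(String[] termByDoc, ToIntFunction<String> parser) {
        int[] cache = new int[termByDoc.length];
        for (int doc = 0; doc < termByDoc.length; doc++) {
            cache[doc] = parser.applyAsInt(termByDoc[doc]);
        }
        return cache;
    }

    // Assumed term format: 8 hex digits with the sign bit flipped, which
    // orders lexicographically the same way the integers order numerically.
    static int decodeSortableHex(String term) {
        return ((int) Long.parseLong(term, 16)) ^ Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        // Terms for the values 7, -3 and 12 in the assumed format.
        String[] terms = {"80000007", "7ffffffd", "8000000c"};
        int[] cache = buildCache(terms, PluggableIntCacheSketch::decodeSortableHex);
        System.out.println(java.util.Arrays.toString(cache));
    }
}
```

The terms stay range-friendly on disk, while sorting compares the cached ints.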

I can see in FieldDocSortedHitQueue where the case statement deals with
the various types of SortField, but at that point it's comparing FieldDoc
objects whose fields[i] is expected to already be an "Integer" object.
Where is that "Integer" object parsed from the String value of the field?

