lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Numeric Range Restrictions: Queries vs Filters
Date Tue, 23 Nov 2004 22:02:36 GMT

: Note that I said FilteredQuery, not QueryFilter.

Doh .. right, sorry.  I confused myself by thinking you were still referring
to your comments of 2004-03-29 comparing DateFilter with RangeQuery wrapped
in a QueryFilter.

: I debate (with myself) on whether add-ons that can be done with other
: code is worth adding to Lucene's core.  In this case the utility
: methods are so commonly needed that it makes sense.  But it could be

In particular, having a class of utilities like that in the code base is
useful, because now the javadocs for classes like RangeQuery and
RangeFilter can reference them as being necessary to ensure that
ranges work the way you expect ... and hopefully fewer people will be
confused in the future.

: I think there needs to be some discussion on what other utility methods
: should be added.  For example, most of the numerics I index are
: positive integers and using a zero-padded is sufficient.  I'd rather
: have clearly recognizable numbers in my fields than some strange
: contortion that requires a conversion process to see.

I'm of two minds: on one hand I think there's no big harm in providing
every conceivable utility function known to man so people have their
choice of representation.  On the other hand, I think it would be nice if
Lucene had a much simpler API for dealing with "non-strings" that just did
"the right thing" based on simple expectations -- without the user having
to ask themselves: "Will I ever need negative numbers?  Will I ever need
numbers bigger than 1000?" or to later remember that they padded this field
to 5 digits and that field to 7 digits.
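
To make the padding concern concrete, here is a minimal sketch in plain
Java (no Lucene involved).  The class name ZeroPadDemo and the helper
pad() are made up for illustration.  It shows that unpadded numbers sort
wrongly as strings, that fixed-width padding fixes this for non-negative
values, and that the chosen width becomes a limit you have to remember.

```java
import java.util.Arrays;

public class ZeroPadDemo {
    /** Left-pads a non-negative value with zeros to exactly `width` digits (hypothetical helper). */
    public static String pad(long value, int width) {
        if (value < 0) {
            // Zero-padding alone cannot order negative numbers correctly.
            throw new IllegalArgumentException("negative value: " + value);
        }
        String digits = Long.toString(value);
        if (digits.length() > width) {
            // The "will I ever need numbers bigger than ..." problem.
            throw new IllegalArgumentException(value + " does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        String[] raw = { "1000", "9", "42" };
        Arrays.sort(raw);     // String order, not numeric order
        System.out.println(Arrays.toString(raw));     // [1000, 42, 9]

        String[] padded = { pad(1000, 5), pad(9, 5), pad(42, 5) };
        Arrays.sort(padded);  // String order now matches numeric order
        System.out.println(Arrays.toString(padded));  // [00009, 00042, 01000]
    }
}
```

Every field padded this way needs the same width at index time and at
query time, which is exactly the bookkeeping described above.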

Having clearly recognized values is something that can (should?) be easily
accomplished by indexing the contorted but lexically sortable value, and
storing the more readable value...

    Document d = /* some doc */;
    Long l = /* some value */;
    Field f1 = Field.UnIndexed("field", l.toString());  /* stored, not indexed */
    Field f2 = Field.UnStored("field", NumberTools.longToString(l.longValue()));  /* indexed, not stored */
    d.add(f1);
    d.add(f2);

(I'm not imagining things right?  that should work, correct?)
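
For anyone wondering what a "contorted but lexically sortable" value can
look like, here is one possible encoding sketched in plain Java.  This is
NOT the encoding NumberTools uses; the class SortableLong and its
encode/decode methods are hypothetical names for illustration only.  The
idea is to flip the sign bit so negative values sort below positive ones,
then write a fixed-width hex string.

```java
public class SortableLong {
    /** Encodes a long as 16 hex digits whose String order matches numeric order. */
    public static String encode(long value) {
        // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto the
        // unsigned range 0..2^64-1 while preserving order.
        long shifted = value ^ Long.MIN_VALUE;
        String hex = Long.toHexString(shifted);  // unsigned, lowercase, no leading zeros
        StringBuilder sb = new StringBuilder(16);
        for (int i = hex.length(); i < 16; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString();
    }

    /** Reverses encode(): parse the unsigned hex and flip the sign bit back. */
    public static long decode(String encoded) {
        return Long.parseUnsignedLong(encoded, 16) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] samples = { -5L, 0L, 3L, 1000L };
        for (long v : samples) {
            System.out.println(v + " -> " + encode(v) + " -> " + decode(encode(v)));
        }
    }
}
```

The encoded form is what you would index, and the plain l.toString() form
is what you would store, so nobody has to read the hex by eye.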

What would really be sweet is if Lucene had an API that
transparently dealt with all of the major primitive types, both at
indexing time and at query time, so that users didn't have to pay any
attention to the stringification, or when to index a different value
than they store...

    Field f = Field.Long("field", l); /* indexes one string, stores the other */
    d.add(f);
    ...
    Query q = new RangeQuery("field", l1, l2); /* knows to use the contorted string */
    ...
    String s = hits.doc(i).getValue("field"); /* returns pretty string */
    Long l = hits.doc(i).getValue("field");   /* returns original Long */
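
None of the calls above exist in Lucene; they are a wish list.  As a
rough illustration of the idea only, here is a plain-Java sketch of a
typed value that carries both representations and answers a range check
using the sortable one.  The class LongFieldSketch and all of its methods
are hypothetical, and the hex encoding is just one possible choice.

```java
import java.util.Locale;

public final class LongFieldSketch {
    private final String name;
    private final long value;

    public LongFieldSketch(String name, long value) {
        this.name = name;
        this.value = value;
    }

    /** Fixed-width form whose String order matches numeric order (sign bit flipped, 16 hex digits). */
    public static String sortable(long v) {
        return String.format(Locale.ROOT, "%016x", v ^ Long.MIN_VALUE);
    }

    public String name() { return name; }

    /** The "pretty" string a caller would want back from a hit. */
    public String storedValue() { return Long.toString(value); }

    /** The "contorted" string an index would actually compare. */
    public String indexedValue() { return sortable(value); }

    /** The original typed value. */
    public long longValue() { return value; }

    /** Inclusive range check done purely on the sortable strings, as a term range would do. */
    public boolean inRange(long lower, long upper) {
        String v = indexedValue();
        return sortable(lower).compareTo(v) <= 0 && v.compareTo(sortable(upper)) <= 0;
    }

    public static void main(String[] args) {
        LongFieldSketch f = new LongFieldSketch("field", -7L);
        System.out.println(f.storedValue() + " indexed as " + f.indexedValue());
        System.out.println("in [-10, 0]? " + f.inRange(-10L, 0L));
        System.out.println("in [0, 10]?  " + f.inRange(0L, 10L));
    }
}
```

The point of the sketch is that the caller only ever passes and receives
longs; which string gets indexed and which gets stored is decided in one
place.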

--

-------------------------------------------------------------------
"Oh, you're a tricky one."                        Chris M Hostetter
     -- Trisha Weir                    hossman@rescomp.berkeley.edu



