lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doron Cohen <DOR...@il.ibm.com>
Subject Re: Searching/indexing date/time values or numeric values?
Date Thu, 18 Jan 2007 22:24:28 GMT
Chris Hostetter <hossman_lucene@fucit.org> wrote on 18/01/2007 14:06:05:

>
>
> NumberTools in the lucene java code base can handle any long regardless
of
> size ...

That's true, just that John's suggested to multiply by e.g. 10e6
might exceed Long.MAX_VALUE (before even applying NumberTools).

> if you look at NumberUtils in the Solr code base you'll find lots
> of great utilities for dealing with ints, longs, floats, and doubles.
>
>
> : Date: Wed, 17 Jan 2007 12:15:33 -0800
> : From: Doron Cohen <DORONC@il.ibm.com>
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Re: Searching/indexing date/time values or numeric values?
> :
> : John Song <delphi329@yahoo.com> wrote on 17/01/2007 11:09:40:
> :
> : > ultimately, everything is text search.  For decimal number, what you
> : > do is to write a customized analyzer which multiple the number by
> : > some factor, round it to a long and then use NumberTools to convert
> : > that into a text string.  Here is what I did for latitude/longitude
> : > search: multiple it by 10e6.
> : >
> :
> : For most cases this would be sufficient.
> :
> : Won't work if multiplying by precision takes beyond Long.MAX.
> : I think an alternative that would work in that case too could
> : be to prefix X with P, where:
> : X == original number.
> : K == number of digits left side of decimal point.
> : P == lexicographically valid representation of K.
> :
> : P is defined as "n0"+D+"n", where D is a one char
> : representation of K, that is: '0', .. '9','a'..'z'.
> :
> : Examples for values of K, prefix P, and result string S:
> : X=.17   K=0 P=n00n S=n00n_.17
> : X=0.170 K=0 P=n00n S=n00n_.17  (no leading/trailing redundant zeros)
> : X=5.8   K=1 P=n01n S=n01n_5.8
> : X=17.66 K=2 P=n02n S=n02n_17.66
> : ...
> : X=123456789.22    K=9  P=n09n  S=n09n_123456789.22
> : X=1234567890.22   K=10 P=n0an  S=n0an_1234567890.22
> : X=12345678901.22  K=11 P=n0bn  S=n0bn_12345678901.22
> : X=123456789012.22 K=12 P=n0cn  S=n0cn_123456789012.22
> :
> : You get the idea - the prefix prevents the sorting error.
> : No special care required for the fraction part.
> :
> : Negative values can work using 'm' instead of 'n' in the
> : prefix, e.g.:
> : X=-123456789.22    K=9  P=n09n  S=m09m_123456789.22
> :
> : This should work for long values as well.
> : I like the fixed length prefix.
> : But code for this would need to be written...
> :
> : > john
> : >
> : > ----- Original Message ----
> : > From: Jiho Han <jhan@InfinityInfo.com>
> : > To: java-user@lucene.apache.org
> : > Sent: Wednesday, January 17, 2007 10:13:47 AM
> : > Subject: Searching/indexing date/time values or numeric values?
> : >
> : > Is there a way to index/search so that a query could be written to
> : > search on a field using arithmetic comparison operators?
> : >
> : > What I mean is if I had a date/time field called CREATEDATE, I would
> : > search for all documents where:
> : >
> : > CREATEDATE > "1/1/2007"
> : >
> : > The above is obvisouly pseudo-query expression.  I did find something
> : > called Range searches on the query syntax documentation page and it
says
> : > the sorting is done lexicographically.  I guess that means it's
sorted
> : > by letter.  I would then need to store all my date/time values in a
> : > format like yyyymmdd hh:mm:ss.
> : > And search, CREATEDATE:[20070101 00:00:00 TO 20070118 00:00:00],
where
> : > the second date/time value is something like midnight tonight.
> : >
> : > But what about a decimal value?  If I have a VERSION field where
values
> : > are like 1.0, 2.5, 11.3, etc.  That wouldn't work.  Because the
values
> : > would be sorted:
> : > 1.0
> : > 11.3
> : > 2.5
> : > In that order.  And if I do VERSION:[1.0 TO 3.0], search would return
> : > all 3 of them.  The only workaround seems to be prepending 0's and
that
> : > would also only work as long as the maximum digits for the interger
part
> : > is known ahead of time.
> : >
> : > Can someone verify/suggest ways to make this work?
> : > Thanks
> : >
> : > Jiho Han
> : > Senior Software Engineer
> : > Infinity Info Systems
> : > The Sales Technology Experts
> : > Tel: 212.563.4400 x6375
> : > Fax: 212.760.0540
> : > jhan@infinityinfo.com
> : > www.infinityinfo.com <http://www.infinityinfo.com/>
> : >
> : >
> : >
> : > ---------------------------------------------------------------------
> : > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : > For additional commands, e-mail: java-user-help@lucene.apache.org
> : >
> : >
> : >
> : >
> : >
> : >
> : >
> : >
> :
>
____________________________________________________________________________________

> :
> : > Cheap talk?
> : > Check out Yahoo! Messenger's low PC-to-Phone call rates.
> : > http://voice.yahoo.com
> : >
> : > ---------------------------------------------------------------------
> : > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : > For additional commands, e-mail: java-user-help@lucene.apache.org
> : >
> :
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : For additional commands, e-mail: java-user-help@lucene.apache.org
> :
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message