lucene-java-user mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: Searching numerical ranges
Date Tue, 19 Feb 2002 22:27:35 GMT
> From: David Elworthy [mailto:dahe@lingomotors.com]
>
> I want to be able to search on a field which contains a numerical
> value, specifying a range, such as 1-100. If my understanding of
> Lucene is correct, all fields look essentially like strings, so a
> simple range query won't work (after all, searching on the range
> "a"-"azz" should not match "b"). So my plan is to pad all numbers to
> a fixed length by prefixing them with zeros on both indexing and
> search, so the range then becomes (e.g.) 000001-000100.

That sounds like a good strategy.
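
To make the padding idea concrete, here is a minimal sketch in plain Java. It does not call any Lucene API; the class name PaddedNumbers and the helpers pad and inRange are made up for illustration, and inRange only imitates the lexicographic term comparison that a string range relies on. It assumes non-negative values that fit in the chosen width.

```java
public class PaddedNumbers {

    // Left-pad a non-negative number with zeros to a fixed width, so that
    // string order and numeric order agree. (Hypothetical helper.)
    static String pad(long value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        String digits = Long.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // Inclusive range test using plain string comparison, standing in for
    // how terms are compared when everything is indexed as a string.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }
}
```

With this, inRange(pad(50, 6), pad(1, 6), pad(100, 6)) holds, while the unpadded inRange("50", "1", "100") does not, which is the problem the padding avoids. The same width has to be used at indexing and at search time.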

> My one worry is that it will upset the rankings, as numbers which
> happen to occur in more documents will get a lower IDF, whereas
> all numbers really ought to receive equal treatment. So a possible
> refinement is to include the clause for the number in my overall
> boolean expression, but give it a boost of zero or some small
> number. So it has to match but does not contribute to the relevance

That should work.

Another alternative is to implement a Filter, which does not affect scoring
at all.  This is just a bit vector which contains ones for documents which
should be included and zeros for others.  That's what the date code uses.
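
As a rough illustration of the bit-vector idea, the sketch below builds a java.util.BitSet with one bit per document. It is not Lucene's Filter class: the class name RangeFilterSketch, the method bitsForRange and the valueByDoc array (the numeric field value for each document number) are assumptions made for the example. A real filter would fill in the bits by walking the index rather than an in-memory array.

```java
import java.util.BitSet;

public class RangeFilterSketch {

    // Build a bit vector with a one for every document whose value lies in
    // [lower, upper] and a zero for every other document.
    // valueByDoc[i] is the (hypothetical) numeric field value of document i.
    static BitSet bitsForRange(long[] valueByDoc, long lower, long upper) {
        BitSet bits = new BitSet(valueByDoc.length);
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            long v = valueByDoc[doc];
            if (v >= lower && v <= upper) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

Because the bit vector only says which documents are allowed, it decides inclusion without adding anything to a document's score, so the IDF of the number never comes into play.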

Doug


