lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Winston <a...@christianity.com>
Subject Re: Searching Ranges
Date Fri, 08 Nov 2002 20:54:51 GMT
thanks for the reply, my apologizes for not explaining myself very
clearly, it has been a long day.

you expressed exactly our situation, unfortunately this is not an option
because we want to have multiple ranges for each document as well, 
there is a possible extension of what you suggested but that is a last
resort.  kinda crazy i know, but you have to meet requirements :).

but i also had a thought while i was looking through the lucene code,
and any comments are welcome.  

i may be very mistaken because it has been a long day but if you look at
the current cvs version of RangeQuery it appears that even if a match is
found it will continue to iterate over terms within a field, and in my
case it is on the order of thousands.  if i add a break after a match
has been found it appears as though the search is improved on avg an
order of magnitude, my math has left me so i cannot be theoretical at
the moment.  i have unit tested the change on my side and on the lucene
side and it works.  note: one hard example is that a query went from 20
seconds to .5 seconds.  any initial thoughts to if there is a case where
this would not work?

beginning line 164:
TermQuery tq = new TermQuery(term);	  // found a match
tq.setBoost(boost);			   // set the boost
q.add(tq, false, false);		  // add to q
break;  // ADDED!


On Fri, 2002-11-08 at 15:09, Mike Barry wrote:
> Alex,
> 
> It is rather confusing. It sounds like you've indexed
> a field that that can be between two values (let's say
> E-J) and then when you have a search term such as G
> you want the docs containing E-J (or A-H or F-K but not A-H
> nor A-C nor J-Z)
> 
> Just of the top of my head but could you index the upper and
> lower bounds as separate fields then when you search do a
> compound query:
> 
>      lower_bound:{ - search_term } AND upper_bound:{ search_term - }
> 
> just a thought.
> > -MikeB.
> 
> 
> Alex Winston wrote:
> 
> > i was hoping that someone could briefly review my current solution to a
> > problem that we have encountered to see if anyone could suggest a
> > possible alternative, because as it stands we have pushed lucene past
> > its current limits.
> >
> > PROBLEM:
> >
> > we were wanting to represent a range of values for a particular field
> > that is searchable over a particular range.
> >
> > an example follows for clarification:
> > we were wanting to store a range of chapters and verses of a book for a
> > particular document, and in turn search to see if a query range includes
> > the range that is represented in the index.
> >
> > if this is unclear please ask for clarification
> >
> > IMPRACTICAL SOLUTION:
> >
> > although this solution seems somewhat impractical it is all we could
> > come up with.
> >
> > our solution involved storing each possible range value within the term
> > which would allow for RangeQuerys to be performed on this particular
> > field.  for very small ranges this seems somewhat practical after
> > profiling.  although once the field ranges began to span multiple
> > chapters and verses, the search times became unreasonable because we
> > were storing thousands of entries for each representative range.
> >
> > i can elaborate on anything that is unclear,
> > but any thoughts on a possible alternative solution within lucene that
> > we overlooked would be extremely helpful.
> > 	
> >
> > alex
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> 
> 


Mime
View raw message