lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Winston <a...@christianity.com>
Subject Re: Searching Ranges
Date Fri, 08 Nov 2002 21:39:25 GMT
apologizes for replying to myself, but another nice side-effect of this
fix is that it virtually eliminates the potential for an
OutOfMemoryError, which was a problem i encountered on extremely large
fields, over 10000 terms, while i was profiling the RangeQuery class.

i can get into specifics if need be, any thoughts?

alex


 On Fri, 2002-11-08 at 15:54, Alex Winston wrote:
> thanks for the reply, my apologizes for not explaining myself very
> clearly, it has been a long day.
> 
> you expressed exactly our situation, unfortunately this is not an option
> because we want to have multiple ranges for each document as well, 
> there is a possible extension of what you suggested but that is a last
> resort.  kinda crazy i know, but you have to meet requirements :).
> 
> but i also had a thought while i was looking through the lucene code,
> and any comments are welcome.  
> 
> i may be very mistaken because it has been a long day but if you look at
> the current cvs version of RangeQuery it appears that even if a match is
> found it will continue to iterate over terms within a field, and in my
> case it is on the order of thousands.  if i add a break after a match
> has been found it appears as though the search is improved on avg an
> order of magnitude, my math has left me so i cannot be theoretical at
> the moment.  i have unit tested the change on my side and on the lucene
> side and it works.  note: one hard example is that a query went from 20
> seconds to .5 seconds.  any initial thoughts to if there is a case where
> this would not work?
> 
> beginning line 164:
> TermQuery tq = new TermQuery(term);	  // found a match
> tq.setBoost(boost);			   // set the boost
> q.add(tq, false, false);		  // add to q
> break;  // ADDED!
> 
> 
> On Fri, 2002-11-08 at 15:09, Mike Barry wrote:
> > Alex,
> > 
> > It is rather confusing. It sounds like you've indexed
> > a field that that can be between two values (let's say
> > E-J) and then when you have a search term such as G
> > you want the docs containing E-J (or A-H or F-K but not A-H
> > nor A-C nor J-Z)
> > 
> > Just of the top of my head but could you index the upper and
> > lower bounds as separate fields then when you search do a
> > compound query:
> > 
> >      lower_bound:{ - search_term } AND upper_bound:{ search_term - }
> > 
> > just a thought.
> > > -MikeB.
> > 
> > 
> > Alex Winston wrote:
> > 
> > > i was hoping that someone could briefly review my current solution to a
> > > problem that we have encountered to see if anyone could suggest a
> > > possible alternative, because as it stands we have pushed lucene past
> > > its current limits.
> > >
> > > PROBLEM:
> > >
> > > we were wanting to represent a range of values for a particular field
> > > that is searchable over a particular range.
> > >
> > > an example follows for clarification:
> > > we were wanting to store a range of chapters and verses of a book for a
> > > particular document, and in turn search to see if a query range includes
> > > the range that is represented in the index.
> > >
> > > if this is unclear please ask for clarification
> > >
> > > IMPRACTICAL SOLUTION:
> > >
> > > although this solution seems somewhat impractical it is all we could
> > > come up with.
> > >
> > > our solution involved storing each possible range value within the term
> > > which would allow for RangeQuerys to be performed on this particular
> > > field.  for very small ranges this seems somewhat practical after
> > > profiling.  although once the field ranges began to span multiple
> > > chapters and verses, the search times became unreasonable because we
> > > were storing thousands of entries for each representative range.
> > >
> > > i can elaborate on anything that is unclear,
> > > but any thoughts on a possible alternative solution within lucene that
> > > we overlooked would be extremely helpful.
> > > 	
> > >
> > > alex
> > 
> > 
> > 
> > --
> > To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> > 
> > 
> 


Mime
View raw message