lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Winston <a...@christianity.com>
Subject Re: Searching Ranges
Date Fri, 08 Nov 2002 22:40:40 GMT
actually i was mistaken, i thought the tests ran successfully but after
looking again i merely got a BUILD SUCCESSFUL, apparently lucenes build
cannot find JUnitTask out of the box with ant1.5.1.  i have not had any
time to work through the problem.  i will look into it tomorrow, if you
have any thoughts in the meantime let me know.

thanks
alex



On Fri, 2002-11-08 at 16:46, Otis Gospodnetic wrote:
> Hello,
> 
> Did you say that you run 'ant test-unit' and that all tests still pass?
> If so, could you attach a cvs diff -ucN RangeQuery.java?
> 
> Thanks,
> Otis
> 
> 
> --- Alex Winston <alex@christianity.com> wrote:
> > apologizes for replying to myself, but another nice side-effect of
> > this
> > fix is that it virtually eliminates the potential for an
> > OutOfMemoryError, which was a problem i encountered on extremely
> > large
> > fields, over 10000 terms, while i was profiling the RangeQuery class.
> > 
> > i can get into specifics if need be, any thoughts?
> > 
> > alex
> > 
> > 
> >  On Fri, 2002-11-08 at 15:54, Alex Winston wrote:
> > > thanks for the reply, my apologizes for not explaining myself very
> > > clearly, it has been a long day.
> > > 
> > > you expressed exactly our situation, unfortunately this is not an
> > option
> > > because we want to have multiple ranges for each document as well, 
> > > there is a possible extension of what you suggested but that is a
> > last
> > > resort.  kinda crazy i know, but you have to meet requirements :).
> > > 
> > > but i also had a thought while i was looking through the lucene
> > code,
> > > and any comments are welcome.  
> > > 
> > > i may be very mistaken because it has been a long day but if you
> > look at
> > > the current cvs version of RangeQuery it appears that even if a
> > match is
> > > found it will continue to iterate over terms within a field, and in
> > my
> > > case it is on the order of thousands.  if i add a break after a
> > match
> > > has been found it appears as though the search is improved on avg
> > an
> > > order of magnitude, my math has left me so i cannot be theoretical
> > at
> > > the moment.  i have unit tested the change on my side and on the
> > lucene
> > > side and it works.  note: one hard example is that a query went
> > from 20
> > > seconds to .5 seconds.  any initial thoughts to if there is a case
> > where
> > > this would not work?
> > > 
> > > beginning line 164:
> > > TermQuery tq = new TermQuery(term);	  // found a match
> > > tq.setBoost(boost);			   // set the boost
> > > q.add(tq, false, false);		  // add to q
> > > break;  // ADDED!
> > > 
> > > 
> > > On Fri, 2002-11-08 at 15:09, Mike Barry wrote:
> > > > Alex,
> > > > 
> > > > It is rather confusing. It sounds like you've indexed
> > > > a field that that can be between two values (let's say
> > > > E-J) and then when you have a search term such as G
> > > > you want the docs containing E-J (or A-H or F-K but not A-H
> > > > nor A-C nor J-Z)
> > > > 
> > > > Just of the top of my head but could you index the upper and
> > > > lower bounds as separate fields then when you search do a
> > > > compound query:
> > > > 
> > > >      lower_bound:{ - search_term } AND upper_bound:{ search_term
> > - }
> > > > 
> > > > just a thought.
> > > > > -MikeB.
> > > > 
> > > > 
> > > > Alex Winston wrote:
> > > > 
> > > > > i was hoping that someone could briefly review my current
> > solution to a
> > > > > problem that we have encountered to see if anyone could suggest
> > a
> > > > > possible alternative, because as it stands we have pushed
> > lucene past
> > > > > its current limits.
> > > > >
> > > > > PROBLEM:
> > > > >
> > > > > we were wanting to represent a range of values for a particular
> > field
> > > > > that is searchable over a particular range.
> > > > >
> > > > > an example follows for clarification:
> > > > > we were wanting to store a range of chapters and verses of a
> > book for a
> > > > > particular document, and in turn search to see if a query range
> > includes
> > > > > the range that is represented in the index.
> > > > >
> > > > > if this is unclear please ask for clarification
> > > > >
> > > > > IMPRACTICAL SOLUTION:
> > > > >
> > > > > although this solution seems somewhat impractical it is all we
> > could
> > > > > come up with.
> > > > >
> > > > > our solution involved storing each possible range value within
> > the term
> > > > > which would allow for RangeQuerys to be performed on this
> > particular
> > > > > field.  for very small ranges this seems somewhat practical
> > after
> > > > > profiling.  although once the field ranges began to span
> > multiple
> > > > > chapters and verses, the search times became unreasonable
> > because we
> > > > > were storing thousands of entries for each representative
> > range.
> > > > >
> > > > > i can elaborate on anything that is unclear,
> > > > > but any thoughts on a possible alternative solution within
> > lucene that
> > > > > we overlooked would be extremely helpful.
> > > > > 	
> > > > >
> > > > > alex
> > > > 
> > > > 
> > > > 
> > > > --
> > > > To unsubscribe, e-mail:  
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > > > For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> > > > 
> > > > 
> > > 
> > 
> > 
> 
> > ATTACHMENT part 2 application/pgp-signature name=signature.asc
> 
> 
> 
> __________________________________________________
> Do you Yahoo!?
> U2 on LAUNCH - Exclusive greatest hits videos
> http://launch.yahoo.com/u2
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> 
> 


Mime
View raw message