lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chuck Williams" <ch...@manawiz.com>
Subject RE: NUMERIC RANGE BOOLEAN
Date Fri, 17 Dec 2004 05:58:27 GMT
Karthik,

RangeQuery expands into a BooleanQuery containing all of the terms in
the index that fall within the range.  By default, BooleanQuery's can
have at most 1,024 terms.  So, if your index has more than 1,024
different prices that fall within your range then you will hit this
exception.  What matters is distinct prices, not multiple items.  E.g.,
it's ok to have 10,000 items at $5 -- that's just one price.  But more
than 1,024 distinct prices is a problem.

You can fix this at least a couple different ways.
  1.  Increase the maximum number of clauses allowed in a BooleanQuery
(see BooleanQuery.maxClauseCount).  Note that this is done at a cost of
performance.
  2.  Restructure your indexed prices and range query to reduce the
number of clauses.  E.g., index dollars and cents as two different
fields.  Then, for a range like $1.33 to $5.27, construct an or of 3
queries:
    a.  $1 and [33 to 99 cents]
    b.  [$2 to $5]
    c.  $5 and [0 to 27 cents]

I don't know about RangeFilter, but look at QueryFilter.  You can use it
with a RangeQuery to implement a range filter.  However, I think you'll
hit the same issue, so Erik may be referring to a new mechanism that is
not in 1.4.3.

Chuck

  > -----Original Message-----
  > From: Karthik N S [mailto:karthik@controlnet.co.in]
  > Sent: Thursday, December 16, 2004 9:38 PM
  > To: Lucene Users List
  > Subject: RE: NUMERIC RANGE BOOLEAN
  > 
  > Hi Erik
  > 
  > Apologies..........
  > 
  > 
  >   Sometimes  I find it hard to understnad the Answer u  reply ....
  > 
  > 
  > 
  >    1) I looked at the the Wiki and similarly padded '0' [ Total
Length =
  > 8 ]
  > at the time of indexing
  > 
  >     so before Indexprocess the values will be   $ 10.25 , $ 0.50 ,$
  > 15.50.....
  > 
  >     After padding and indexing finally [ Used Luke to moniter ] the
  > values
  > were 00000010.25 ,00000000.25,00000015.50
  > 
  > 
  >   2) I did not find the RangeFilter API in Lucene1.4.3 [is it
recently
  > added
  > if so How Do I use the same some code snippets
  >      please ]
  > 
  > 
  > 
  > 
  > with regards
  > Karthik
  > 
  > 
  > 
  > -----Original Message-----
  > From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
  > Sent: Thursday, December 16, 2004 6:55 PM
  > To: Lucene Users List
  > Subject: Re: NUMERIC RANGE BOOLEAN
  > 
  > 
  > On Dec 16, 2004, at 7:17 AM, Karthik N S wrote:
  > > We have to get the All the Hits int the Range ,
  > >
  > >    So  0.99 cents IS & ALWAYS be 0.99 cents  on which we do the
price
  > > Comaprison from consumer point of view .
  > >
  > >
  > > I hope  I have answered u'r Question
  > 
  > 
  > No, in fact, you have not.  If you want to continue to receive my
help
  > here, you need to provide *details*.  You pose often ambiguous and
hard
  > to decipher questions.  Please help us help you by answering the
  > questions we ask precisely.  What are the values (exact string
values)
  > in that field?  Please also read the wiki page on indexing numeric
  > values.
  > 
  > Look at using the new RangeFilter rather than a RangeQuery due to
the
  > noted issues with doing a RangeQuery.
  > 
  > 	Erik
  > 
  > 
  > >
  > >
  > > With regards
  > > Karthik
  > >
  > > -----Original Message-----
  > > From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
  > > Sent: Thursday, December 16, 2004 5:24 PM
  > > To: Lucene Users List
  > > Subject: Re: NUMERIC RANGE BOOLEAN
  > >
  > >
  > > On Dec 16, 2004, at 5:03 AM, Morus Walter wrote:
  > >> Erik Hatcher writes:
  > >>
  > >>> TooManyClauses exception occurs when a query such as a
RangeQuery
  > >>> expands to more than 1024 terms.  I don't see how this could be
the
  > >>> case in the query you provided - are you certain that is the
query
  > >>> that
  > >>> generated the error?
  > >>>
  > >> Why not: the terms might be 00000003 00000003.1 00000003.11 ...
  > >>
  > >> So the question is, how do his terms look like...
  > >
  > > Ah, good point! So, Karthik - what are are the values of those
terms?
  > >
  > > Pragmatically, do you really need to do a range involving the
cents of
  > > a price?
  > >
  > > 	Erik
  > >
  > >
  > >
---------------------------------------------------------------------
  > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > > For additional commands, e-mail:
lucene-user-help@jakarta.apache.org
  > >
  > >
  > >
---------------------------------------------------------------------
  > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > > For additional commands, e-mail:
lucene-user-help@jakarta.apache.org
  > 
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
  > 
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message