lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sharma, Siddharth" <Siddharth.Sha...@Staples.com>
Subject RE: Too many clauses
Date Mon, 17 Oct 2005 20:36:58 GMT
I thought of that but I had that listed as a last fallback option because I
was not sure what it meant in terms of performance since I am a newbie to
Lucene.
So if I bump up my heap (I assume that's what you are referring to when you
say java pool) it'll be ok?
Are there metrics around this? 
At x max_clauses, jvm heap should be y meg
At x + 1024, it should be z meg




-----Original Message-----
From: Aigner, Thomas [mailto:TAigner@WescoDist.com] 
Sent: Monday, October 17, 2005 3:42 PM
To: java-user@lucene.apache.org
Subject: RE: Too many clauses

Another way around it is to increase the max clause count.

//Setting the clause Count
 BooleanQuery.setMaxClauseCount(int);

Can use maxint or some number smaller.. When I set this high, I have had
to set the java pool higher for memory as well.

Tom

-----Original Message-----
From: Sharma, Siddharth [mailto:Siddharth.Sharma@Staples.com] 
Sent: Monday, October 17, 2005 3:32 PM
To: java-user@lucene.apache.org
Subject: Too many clauses

Query:  caught a class org.apache.lucene.queryParser.ParseException
 with message: Too many boolean clauses

I realize why this is happening (the 1024 clauses limit for
BooleanQuery).
My question is more design related.

During customer registration, the customer defines a set of
skus/products
that we should never display to them. These products are part of our
catalog
offering but we are forbidden to make them available to this customer.
This
list is called the block list and can potentially be large (6 to 7
thousand).

When a customer logs in, this block list is identified and currently I
am
using QueryParser to parse these skus to block/exclude. That is why I am
hitting against the 1024 upper bound.

To circumvent it, here are a few options that I have thought of:
1. Chunk it up: 
  a. Create a filter based on a query that has a maximum of 1024. 
  b. Get its bits.
  c. Get the next 1024 blocked skus and create a filter out of it and
get   
     its bits.
  d. AND the two BitSets.
  e. Do this till all blocked skus and other filters are ANDed together
for 
     the final BitSet.

2. Build the block list into the index somehow
  a. My index is based on SKUs, not on customer.
  b. I could add a field in each SKU document that contains the
customer-ids

     who want this SKU blocked.
  c. But this field's value could be very large.

3. Some other obvious way that I am stupid enough not to be able to 
   visualize.

Thanks in advance
Sid





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message