lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Fraschetti <frasche...@gmail.com>
Subject Re: BooleanQuery - Too Many Clases on date range.
Date Mon, 04 Oct 2004 09:06:34 GMT
Surely some folks out there have used lucene on a large scale and have
had to compensate for this somehow, any other solutions? Morus, thank
you very more for your imput, and I am looking into your solution,
just putting my feelers out there once more.

The lucene API is very limited as to it's descriptions of it's
components, short of digging into the code, is there a good doc
somewhere out there that explains the workins of lucene?


On Mon, 4 Oct 2004 01:57:06 -0700, Chris Fraschetti
<fraschetti@gmail.com> wrote:
> So before I spend a significant amount of time digging into the lucene
> code, how does your experience with lucene give light to my
> situation....  Our current index is pretty huge, and with each
> increase in side i've had, i've experienced a problem like this...
> Without taking up too much of your time.. because obviously this i my
> task, I thought i'd ask you if you'd had any experience with this
> boolean clause nonsense...  of course it can be overcome, but if you
> know a quick hack, awesome, otherwise.. no big, but off to work i go
> :)
> 
> -Fraschetti
> 
> 
> ---------- Forwarded message ----------
> From: Morus Walter <morus.walter@tanto.de>
> Date: Mon, 4 Oct 2004 09:01:50 +0200
> Subject: Re: BooleanQuery - Too Many Clases on date range.
> To: Lucene Users List <lucene-user@jakarta.apache.org>, Chris
> Fraschetti <fraschetti@gmail.com>
> 
> Chris Fraschetti writes:
> > So i decicded to move my epoch date to the  20040608 date which fixed
> > my boolean query problem in regards to my current data size (approx
> > 600,000) ....
> >
> > but now as soon as I do a query like ...      a*
> > I get the boolean error again. Google obviously can handle this query,
> > and I'm pretty sure lucene can handle it.. any ideas? With out
> > without a date dange specified i still get the  TooManyClauses error.
> 
> 
> > I tired cranking the maxclauses up to Integer.MaxInt, but java gave me
> > a out of memory error. Is this b/c the boolean search tried to
> > allocate that many clauses by default or because my query actually
> > needed that many clauses?
> 
> boolean search allocates clauses for all tokens having the prefix or
> matching the wildcard expression.
> 
> > Why does it work on small indexes but not
> > large?
> Because there are fewer tokens starting with a.
> 
> > Is there any way to have the parser create as many clauses as
> > it can and then search with what it has? w/o recompiling the source?
> >
> You need to create your own version of Wildcard- and Prefix-Query
> that takes a maximum term number and ignores further clauses.
> And you need a variant of the query parser that uses these queries.
> 
> This can be done, even without recompiling lucene, but you will have to
> do some programming at the level of lucene queries.
> Shouldn't be hard, since you can use the sources as a starting point.
> 
> I guess this does not exist because the lucene developer decided to prefer
> a query error rather than uncomplete results.
> 
> Morus
> 
> 
> --
> ___________________________________________________
> Chris Fraschetti, Student CompSci System Admin
> University of San Francisco
> e fraschetti@gmail.com | http://meteora.cs.usfca.edu
> 



-- 
___________________________________________________
Chris Fraschetti, Student CompSci System Admin
University of San Francisco
e fraschetti@gmail.com | http://meteora.cs.usfca.edu

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message