From: "Angelov, Rossen"
To: "'Lucene Users List'"
Subject: RE: BooleanQuery - TooManyClauses
Date: Tue, 26 Oct 2004 10:43:05 -0500
Message-ID: <4F8DDDFDAC9A864AAED5BB875129DF4B0690C964@tmskoex01.tm.thomsonmedia.com>

>On Oct 25, 2004, at 6:35 PM, Angelov, Rossen wrote:
>> Why there is a limit on the number of clauses? and is there any harm in
>> setting MaxClauseCount to Integer.MAX_VALUE?
>
>The harm is in performance and resource utilization. Rather than do
>this, though, read on...
>
>> I'm using a Range Query on a field that represents dates and getting
>> BooleanQuery$TooManyClauses exception.
>> This is the query - +/article/createddateiso8601:[20030101000000 TO
>> 20031231999999]
>
>Do you really need to do ranges down to that time level? Or are you
>really just concerned with date? If you indexed using YYYYMMDD
>instead, there would only be a maximum of 365 terms in that range,
>whereas you've got zillions (ok, I was too lazy to do the math! But
>far more than 1,024).

I need to do range searches. They are part of the requirements and, even
worse, the range can be as big as 10 years for now. It will get bigger.

I'm indexing using the YYYYMMDDHHmmssZ format and, as you said, there will be
more than just 365 terms per year. The number changes every day, as new
documents are indexed daily. The only upper bound I can see is the number of
documents that were indexed: I guess maxClauseCount never needs to be more
than the number of indexed documents.

>I recommend changing how you index dates, or at least use a different
>field for queries that do not need to concern themselves with the
>timestamp aspect.

What do you mean by changing how the dates are indexed? By the way, this
field is indexed as a string.

>
> Erik
>

Ross
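
A minimal sketch of the day-resolution idea quoted above, in plain Java with no Lucene calls. It assumes the timestamps are strings that begin with YYYYMMDD, as in the YYYYMMDDHHmmssZ format described in this message. The class and method names (DayResolutionTerms, toDayTerm, maxDayTerms) are hypothetical and not part of Lucene. The truncated value would presumably go into a separate date-only field, and range queries that do not care about the time of day would target that field instead.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not a Lucene class.
final class DayResolutionTerms {

    private DayResolutionTerms() {}

    /** Truncates a "YYYYMMDDHHmmss..." string to its "YYYYMMDD" prefix. */
    static String toDayTerm(String timestamp) {
        if (timestamp == null || timestamp.length() < 8) {
            throw new IllegalArgumentException("expected at least YYYYMMDD: " + timestamp);
        }
        return timestamp.substring(0, 8);
    }

    /**
     * Upper bound on the number of distinct day-level terms that an
     * inclusive range [fromDay TO toDay] can expand to. The real count
     * is lower when some days have no indexed documents.
     */
    static long maxDayTerms(String fromDay, String toDay) {
        LocalDate from = LocalDate.parse(fromDay, DateTimeFormatter.BASIC_ISO_DATE);
        LocalDate to = LocalDate.parse(toDay, DateTimeFormatter.BASIC_ISO_DATE);
        return ChronoUnit.DAYS.between(from, to) + 1;
    }

    public static void main(String[] args) {
        // Value to store in the assumed date-only field.
        System.out.println(toDayTerm("20030101123045"));
        // One calendar year at day resolution.
        System.out.println(maxDayTerms("20030101", "20031231"));
        // Ten calendar years at day resolution.
        System.out.println(maxDayTerms("20030101", "20121231"));
    }
}
```

Under these assumptions the one-year range from the query above expands to at most 365 day-level terms. A ten-year range (2003 through 2012) can still reach 3,653, which is above the 1,024 figure mentioned in the thread, so very long ranges may need an even coarser field (for example YYYYMM) or a raised limit.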