From: Doug Cutting
To: Lucene Users List
Date: Thu, 12 Jun 2003 11:28:48 -0700
Subject: Re: OutOfMemoryErrors searching with WildCardQueries
Message-ID: <3EE8C660.9030808@lucene.com>

Konrad Kolosowski wrote:
> If the index grows to hundred thousand documents, with users simultaneously
> searching indexes for different locales, what is the best way
> to cap the memory requirement? Limiting the number of terms, or the number
> of terms containing wildcards, or eliminating wildcard searches altogether?

This was discussed recently on lucene-dev@jakarta.apache.org in a thread
whose subject contains "too many hits - OutOfMemoryError".

I checked in a patch that limits the number of terms a wildcard is
permitted to expand into. The default limit is 1000. If a wildcard expands
to more than that, an exception is thrown. Each term that a wildcard expands
into requires around 2kB, so this caps each wildcarded query term at roughly
2MB. If you have queries with large numbers of wildcarded terms, you might
consider limiting that as well.

This patch is in the latest version of Lucene in CVS, but not yet in a release.

Doug
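
To make the arithmetic above concrete, here is a rough sketch of the idea, not the patch itself and not Lucene's API. It assumes a plain list of index terms; the names expand_wildcard, TooManyTermsError, and worst_case_bytes are made up for illustration, and Python's fnmatch stands in for wildcard matching (it also accepts [..] character classes, unlike a plain * and ? syntax).

```python
from fnmatch import fnmatchcase

MAX_EXPANDED_TERMS = 1000      # default limit quoted in the message
APPROX_BYTES_PER_TERM = 2_000  # "around 2kB" per expanded term, per the message


class TooManyTermsError(Exception):
    """Hypothetical stand-in for the exception raised when the limit is hit."""


def expand_wildcard(pattern, index_terms, max_terms=MAX_EXPANDED_TERMS):
    """Return the index terms matching pattern, failing once max_terms is exceeded."""
    matches = []
    for term in index_terms:
        if fnmatchcase(term, pattern):
            # Stop as soon as one more match would exceed the limit,
            # so memory use stays bounded even for patterns like "*".
            if len(matches) >= max_terms:
                raise TooManyTermsError(
                    f"{pattern!r} expands to more than {max_terms} terms")
            matches.append(term)
    return matches


def worst_case_bytes(wildcard_terms_in_query, max_terms=MAX_EXPANDED_TERMS):
    """Rough upper bound on expansion memory for a single query."""
    return wildcard_terms_in_query * max_terms * APPROX_BYTES_PER_TERM
```

With the figures from the message, worst_case_bytes(1) comes to about 2MB, and a query with five wildcarded terms could reach about 10MB. That is why the number of wildcarded terms per query may be worth limiting as well.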