From: Erik Hatcher
Subject: Re: OUTOFMEMORY ERROR
Date: Thu, 7 Jul 2005 11:53:46 -0400
To: general@lucene.apache.org

On Jul 7, 2005, at 9:40 AM, MariLuz Elola wrote:

> Thanks Erik,
> I was wrong; the query that actually throws the OutOfMemory error is
> ==> ID:0* -ID:xtent.
> I have tried to reproduce the error with the query ID:0* alone, but
> the exception doesn't appear.
> One other thing: when the user searches without entering any query,
> internally I create the following query ==> ID:0* OR NOT ID:xtent.

That's a hairy query.  I definitely do not recommend doing something
like that with prefix queries.  Check out using a Filter for some of
this sort of thing also.

> And when this query is parsed by QueryParser I get ID:0* -ID:xtent
> (translated ==> ID:0* AND NOT ID:xtent), isn't it?  Is QueryParser
> working wrong???

It depends.  By default, QueryParser uses OR as the default operator.

> About maxClauseCount (by default 1024), I am setting this property:
> org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Bumping up that limit is not necessarily the best thing to do - I
recommend changing your approach to querying all documents rather
than trying to make BooleanQuery happy with an enormously inefficient
query.

     Erik

>
> Mari Luz
>
> ----- Original Message ----- From: "Erik Hatcher"
> To:
> Sent: Thursday, July 07, 2005 2:46 PM
> Subject: Re: OUTOFMEMORY ERROR
>
>
> On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
>
>> The query is ==> ID:0*
>> This query returns all the documents, exactly 210.000 documents.
>> If the user doesn't specify any criteria in the search user
>> interface, the server searches all the documents.
>>
>
> Doing a prefix query (which ID:0* is) internally builds a
> BooleanQuery OR'ing all unique terms in the ID field that begin with
> a "0".  The built-in limit is 1,024 clauses in a BooleanQuery.
>
> You will need to re-think your approach.  If the goal is to return
> all documents, then use IndexReader to walk them.  If the goal is to
> have a general user query expression where ID:0* would be entered you
> will need to account for that possibility with more system resources
> and bumping up the BooleanQuery limit or indexing differently so that
> there are not so many terms being put into the BooleanQuery.  It is
> difficult to offer specific advice as I'm not sure what your use
> cases are.
>
>     Erik
>
>>
>> Mari Luz
>>
>> ----- Original Message ----- From: "Erik Hatcher"
>> To:
>> Sent: Wednesday, July 06, 2005 8:12 PM
>> Subject: Re: OUTOFMEMORY ERROR
>>
>>
>> We'll need some more details to help.  What query was it?
>>
>>     Erik
>>
>> On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:
>>
>>> Hi, I have a problem when I try to run a simple query, without
>>> sorting, against an index with 210.000 documents.
>>> After executing the query several times I get the OutOfMemory
>>> error.
>>> I am creating an IndexSearcher(pathDir) on every search.
>>> I don't know whether it is necessary to create only one
>>> IndexSearcher and cache it.
>>> If I search an index with only 50.000 documents, the
>>> OutOfMemory error doesn't appear.
>>> ------------------------
>>> ENVIRONMENT DESCRIPTION:
>>> ------------------------
>>>
>>> ---SERVER---
>>> MEMORY        2GB
>>> APP SERVER    Jboss3.2.3
>>> JAVA_OPTS     -Xmx640M -Xms640M
>>>
>>> ----LUCENE 1.4.3-------
>>> INDEX                 +- 210.000 documents
>>> EACH DOCUMENT         +- 20 fields (metadata)
>>> SIZE TEXT DOCUMENT    1k
>>>
>>> ------------------------
>>> ERROR:
>>> ------------------------
>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>> java.lang.OutOfMemoryError
>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>> java.lang.OutOfMemoryError
>>> 18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected Error; nested exception is:
>>>         java.lang.OutOfMemoryError
>>> 18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.LogInterceptor.handleException(LogInterceptor.java:374)
>>> 18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
>>> 18:52:18,661 ERROR [STDERR]     at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
>>> 18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
>>> 18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.Container.invoke(Container.java:700)
>>> 18:52:18,662 ERROR [STDERR]     at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>>> 18:52:18,662 ERROR [STDERR]     at sun.reflect.DelegatingMethodAccessorImpl.invok
>>> .
>>> .
>>> Exception java.lang.OutOfMemoryError: requested 4 bytes for
>>> CMS: Work queue overflow; try -XX:-CMSParallelRemarkEnabled.
>>> Out of swap space?
>>>
>>>
>>> Could anybody help me???
>>>
>>> Thanks in advance
>>>
>>> Mari Luz
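
Addendum: to make the prefix-query point in this thread concrete, here
is a minimal sketch of the expansion it describes, where every indexed
term that starts with the prefix becomes one OR clause.  It is plain
Java with no Lucene dependency, so nothing in it is Lucene's own code:
PrefixExpansionSketch, expandPrefix, buildIds and
TooManyClausesException are hypothetical names, and the zero-padded
seven-digit ID layout is an assumption, since the thread never shows
what the ID values look like.  Only the 1,024 limit and the 210,000
document count come from the messages above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; this is not Lucene code.
final class PrefixExpansionSketch {

    /** Mirrors the default clause limit (1,024) mentioned in the thread. */
    static final int MAX_CLAUSE_COUNT = 1024;

    /** Thrown when a prefix would expand to more clauses than allowed. */
    static final class TooManyClausesException extends RuntimeException {
        TooManyClausesException(String message) {
            super(message);
        }
    }

    /**
     * Collects every term that starts with the prefix, one OR clause per
     * term, and fails once the number of clauses would exceed maxClauses.
     */
    static List<String> expandPrefix(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix; in sorted order the
        // terms sharing a prefix are contiguous, so stop at the first miss.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(
                        "prefix '" + prefix + "' matches more than " + maxClauses + " terms");
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Assumed ID layout: zero-padded seven-digit strings, 0000000 upward. */
    static SortedSet<String> buildIds(int count) {
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < count; i++) {
            ids.add(String.format(Locale.ROOT, "%07d", i));
        }
        return ids;
    }

    public static void main(String[] args) {
        SortedSet<String> ids = buildIds(210_000);

        // A selective prefix stays far below the limit.
        System.out.println(expandPrefix(ids, "020999", MAX_CLAUSE_COUNT).size());

        // "0" matches every ID in this layout, so the default limit rejects it.
        try {
            expandPrefix(ids, "0", MAX_CLAUSE_COUNT);
        } catch (TooManyClausesException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the limit is raised above the number of IDs, the same call succeeds
but holds one clause per matching term in memory, which is the kind of
oversized query the reply above advises against building.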