lucene-general mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: OUTOFMEMORY ERROR
Date Thu, 07 Jul 2005 15:53:46 GMT

On Jul 7, 2005, at 9:40 AM, MariLuz Elola wrote:
> Thanks Erik,
> I was wrong: the query that actually throws the OutOfMemory error is
> ==> ID:0* -ID:xtent.
> I have tried to reproduce the error with the query ID:0*, but the
> exception doesn't appear.

> One other thing: when the user searches without entering any query,
> internally I create the following query ==> ID:0* OR NOT ID:xtent.

That's a hairy query.  I definitely do not recommend doing something  
like that with prefix queries.  Check out using a Filter for some of  
this sort of thing also.
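
As a rough illustration of the Filter idea: a Lucene 1.4 Filter hands the searcher a java.util.BitSet with one bit per document number, so "everything except ID:xtent" can be expressed as a full bit set with a few bits cleared, with no prefix expansion at all. The sketch below is plain Java and does not call Lucene; the class name, the method name, and the idea that the excluded document numbers were looked up beforehand are all assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Lucene: builds the kind of BitSet a
// Filter could return for "all documents except a few excluded ones".
final class AllButExcludedSketch {

    // maxDoc: one more than the largest document number in the index.
    // excludedDocs: document numbers to leave out (assumed to have been
    // found already, e.g. the documents whose ID is "xtent").
    static BitSet allExcept(int maxDoc, int[] excludedDocs) {
        BitSet bits = new BitSet(maxDoc);
        bits.set(0, maxDoc);              // start with every document allowed
        for (int doc : excludedDocs) {
            if (doc >= 0 && doc < maxDoc) {
                bits.clear(doc);          // drop each excluded document
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // 210,000 documents, one of them (number 42, chosen arbitrarily) excluded.
        BitSet bits = allExcept(210_000, new int[] {42});
        System.out.println("documents allowed: " + bits.cardinality());
    }
}
```

The memory cost here is one bit per document, regardless of how many distinct ID terms the index holds.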

> And when QueryParser parses this query I get ID:0* -ID:xtent
> (translated ==> ID:0* AND NOT ID:xtent), don't I? Is QueryParser
> working incorrectly?

It depends.  By default, QueryParser uses OR as the default operator.

> About maxClauseCount (1024 by default), I am setting this property:
> org.apache.lucene.search.BooleanQuery.maxClauseCount =
> es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Bumping up that limit is not necessarily the best thing to do - I  
recommend changing your approach to querying all documents rather  
than trying to make BooleanQuery happy with an enormously inefficient  
query.

     Erik
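
To make the cost of the prefix query concrete, here is a small plain-Java simulation of the expansion described in this thread: every unique term starting with the prefix becomes one clause, and the clause list either hits the limit or grows with the number of terms. It does not use Lucene; the class and method names, the IllegalStateException, and the zero-padded ID format are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical simulation, not Lucene code: mimics how a prefix such as
// "0" turns into one OR clause per matching term in a sorted term list.
final class PrefixExpansionSketch {

    static List<String> expandPrefix(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // Matching terms are contiguous in sorted order, starting at the
        // first term that is >= the prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;                    // past the last matching term
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("more than " + maxClauses + " clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Assumed ID format: "0" followed by six digits, one unique ID per document.
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < 210_000; i++) {
            ids.add(String.format("0%06d", i));
        }
        try {
            expandPrefix(ids, "0", 1024); // the default limit mentioned above
        } catch (IllegalStateException e) {
            System.out.println("default limit: " + e.getMessage());
        }
        // Raising the limit to the document count avoids the exception but
        // keeps one clause per term in memory for every search.
        System.out.println("raised limit: " + expandPrefix(ids, "0", 210_000).size() + " clauses");
    }
}
```

With the limit raised to the number of documents, each search carries one clause per unique ID, which is the kind of growth that makes the approach expensive on a large index.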


>
>    Mari Luz
>
> ----- Original Message ----- From: "Erik Hatcher"  
> <erik@ehatchersolutions.com>
> To: <general@lucene.apache.org>
> Sent: Thursday, July 07, 2005 2:46 PM
> Subject: Re: OUTOFMEMORY ERROR
>
>
>
> On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
>
>> The query is ==> ID:0*
>> This query returns all the documents, exactly 210.000 documents.
>> If the user doesn't specify any criteria in the search user
>> interface, the server searches all the documents.
>>
>
> Doing a prefix query (which ID:0* is) internally builds a
> BooleanQuery OR'ing all unique terms in the ID field that begin with
> a "0".  The built in limit is 1,024 clauses in a BooleanQuery.
>
> You will need to re-think your approach.  If the goal is to return
> all documents, then use IndexReader to walk them.  If the goal is to
> have a general user query expression where ID:0* would be entered you
> will need to account for that possibility with more system resources
> and bumping up the BooleanQuery limit or indexing differently so that
> there are not so many terms being put into the BooleanQuery.  It is
> difficult to offer specific advice as I'm not sure what your use
> cases are.
>
>     Erik
>
>
>
>
>>
>>    Mari Luz
>>
>>
>>
>> ----- Original Message ----- From: "Erik Hatcher"  
>> <erik@ehatchersolutions.com>
>> To: <general@lucene.apache.org>
>> Sent: Wednesday, July 06, 2005 8:12 PM
>> Subject: Re: OUTOFMEMORY ERROR
>>
>>
>> We'll need some more details to help.  What query was it?
>>
>>     Erik
>>
>> On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:
>>
>>
>>
>>> Hi, I have a problem when I run a simple query, without sorting,
>>> against an index with 210.000 documents.
>>> After executing the query several times I get the OutOfMemory
>>> error.
>>> I am creating an IndexSearcher(pathDir) for every search.
>>> I don't know whether I need to create only one IndexSearcher
>>> and cache it.
>>> If I search an index with only 50.000 documents, the
>>> OutOfMemory error doesn't appear.
>>> ------------------------
>>> ENVIRONMENT DESCRIPTION:
>>> ------------------------
>>>
>>> ---SERVER---
>>> MEMORY 2GB
>>> APP SERVER Jboss3.2.3
>>> JAVA_OPTS -Xmx640M -Xms640M
>>>
>>> ----LUCENE 1.4.3-------
>>> INDEX +- 210.000 documents
>>> EACH DOCUMENT +- 20 fields (metadata)
>>> SIZE TEXT DOCUMENT 1k
>>>
>>> ------------------------
>>> ERROR:
>>> ------------------------
>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>> java.lang.OutOfMemoryError
>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>> java.lang.OutOfMemoryError
>>> 18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected   
>>> Error; nested exception is:
>>>         java.lang.OutOfMemoryError
>>> 18:52:18,661 ERROR [STDERR]     at  
>>> org.jboss.ejb.plugins.LogInterceptor.handleException  
>>> (LogInterceptor.java:374)
>>> 18:52:18,661 ERROR [STDERR]     at  
>>> org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
>>> 18:52:18,661 ERROR [STDERR]     at  
>>> org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke  
>>> (ProxyFactoryFinderInterceptor.java:122)
>>> 18:52:18,662 ERROR [STDERR]     at  
>>> org.jboss.ejb.StatelessSessionContainer.internalInvoke  
>>> (StatelessSessionContainer.java:331)
>>> 18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.Container.invoke  
>>> (Container.java:700)
>>> 18:52:18,662 ERROR [STDERR]     at  
>>> sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>>> 18:52:18,662 ERROR [STDERR]     at  
>>> sun.reflect.DelegatingMethodAccessorImpl.invok
>>> .
>>> .
>>> Exception java.lang.OutOfMemoryError: requested 4 bytes for  
>>> CMS:   Work queue overflow; try -XX:-CMSParallelRemarkEnabled.  
>>> Out of  swap  space?
>>>
>>>
>>> Could anybody help me???
>>>
>>> Thanks in advance
>>>
>>>     Mari Luz
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>

