lucene-general mailing list archives

From "MariLuz Elola" <mel...@seinet.es>
Subject Re: OUTOFMEMORY ERROR
Date Thu, 07 Jul 2005 14:28:22 GMT
Sorry, I was wrong again.
I can use IndexReader after all... forget the last email :-D

----- Original Message ----- 
From: "MariLuz Elola" <melola@seinet.es>
To: <general@lucene.apache.org>
Sent: Thursday, July 07, 2005 4:16 PM
Subject: Re: OUTOFMEMORY ERROR


> Erik, I have a problem.
> First, I have created several IndexWriters.
> One of them holds 210.000 documents, and in the future there will be
> IndexWriters with millions of documents.
> I need to retrieve all the documents.
> I am searching with the query ID:0* because it matches all the
> documents.
> Specifically, I read the ID metadata (hits.doc(start).get(.ID)) to
> collect the IDs of all the documents of a specific IndexWriter.
> Doing this runs out of memory.
> About maxClauseCount (1024 by default), I am setting this property:
> org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
> You gave me an idea: to use IndexReader instead of IndexSearcher for
> getting all the documents.
> I think it is not possible to use IndexReader, because I need the ID,
> not the physical files:
>
>      Directory directory = FSDirectory.getDirectory(path, false);
>      IndexReader reader = IndexReader.open(directory);
>      for (int i = 0; i < reader.maxDoc(); i++) ............
>
> Moreover, "directory" contains the documents of all the IndexWriters.
>
>
>        Mari Luz
>
> ----- Original Message ----- 
> From: "MariLuz Elola" <melola@seinet.es>
> To: <general@lucene.apache.org>
> Sent: Thursday, July 07, 2005 3:40 PM
> Subject: Re: OUTOFMEMORY ERROR
>
>
>> Thanks Erik,
>> I was wrong: the query that actually throws the OutOfMemory error is ==>
>> ID:0* -ID:xtent.
>> I tried to reproduce the error with the query ID:0*, but the
>> exception doesn't appear.
>> I will use IndexReader instead of IndexSearcher for getting all the
>> documents. It's a good idea.
>> Another thing: when the user searches without entering any query, internally I
>> create the following query ==> ID:0* OR NOT ID:xtent. When QueryParser
>> parses this query I get ID:0* -ID:xtent (translated ==> ID:0*
>> AND NOT ID:xtent), is that right? Is QueryParser working incorrectly?
>> About maxClauseCount (by default 1024), I am setting this property:
>> org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
>>
>>    Mari Luz
>>
>> ----- Original Message ----- 
>> From: "Erik Hatcher" <erik@ehatchersolutions.com>
>> To: <general@lucene.apache.org>
>> Sent: Thursday, July 07, 2005 2:46 PM
>> Subject: Re: OUTOFMEMORY ERROR
>>
>>
>>
>> On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
>>> The query is ==> ID:0*
>>> This query returns all the documents, exactly 210.000 documents.
>>> If the user doesn't specify any criteria in the search user
>>> interface, the server searches all the documents.
>>
>> Doing a prefix query (which ID:0* is) internally builds a
>> BooleanQuery OR'ing all unique terms in the ID field that begin with
>> a "0".  The built in limit is 1,024 clauses in a BooleanQuery.
>>
>> You will need to re-think your approach.  If the goal is to return
>> all documents, then use IndexReader to walk them.  If the goal is to
>> have a general user query expression where ID:0* would be entered you
>> will need to account for that possibility with more system resources
>> and bumping up the BooleanQuery limit or indexing differently so that
>> there are not so many terms being put into the BooleanQuery.  It is
>> difficult to offer specific advice as I'm not sure what your use
>> cases are.
>>
>>     Erik
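
The behaviour Erik describes can be modelled in a few lines of plain Java. This is only an illustrative sketch, not Lucene code: the class PrefixExpansionSketch, its methods, the zero-padded ID format, and the IllegalStateException (standing in for whatever error Lucene raises at the clause limit) are all assumptions made for the example. It shows why a prefix such as "0" that matches every ID produces one clause per unique term, so the clause count grows with the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative model only: no Lucene classes are used here.
final class PrefixExpansionSketch {

    // Collects one "clause" per term that starts with the prefix and
    // fails as soon as the clause count would exceed maxClauseCount.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauseCount) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix; in sorted order all
        // terms sharing the prefix are contiguous from that point.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last matching term
            }
            if (clauses.size() >= maxClauseCount) {
                throw new IllegalStateException("too many clauses: limit is " + maxClauseCount);
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Builds n hypothetical zero-padded IDs such as "0000042".
    static SortedSet<String> sampleIds(int n) {
        SortedSet<String> ids = new TreeSet<>();
        for (int i = 0; i < n; i++) {
            ids.add(String.format("%07d", i));
        }
        return ids;
    }

    public static void main(String[] args) {
        SortedSet<String> ids = sampleIds(210_000);
        try {
            expand(ids, "0", 1024); // default-sized limit
        } catch (IllegalStateException e) {
            System.out.println("default limit: " + e.getMessage());
        }
        // Raising the limit to the document count lets the expansion finish,
        // but the query then holds one clause per document.
        System.out.println("raised limit: " + expand(ids, "0", 210_000).size() + " clauses");
    }
}
```

In this model, raising the limit to the number of documents removes the error but leaves a query whose size is proportional to the index, which is consistent with the advice to walk the documents directly when every document is wanted.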
>>
>>
>>
>>>
>>>    Mari Luz
>>>
>>>
>>>
>>> ----- Original Message ----- From: "Erik Hatcher" 
>>> <erik@ehatchersolutions.com>
>>> To: <general@lucene.apache.org>
>>> Sent: Wednesday, July 06, 2005 8:12 PM
>>> Subject: Re: OUTOFMEMORY ERROR
>>>
>>>
>>> We'll need some more details to help.  What query was it?
>>>
>>>     Erik
>>>
>>> On Jul 6, 2005, at 1:22 PM, MariLuz Elola wrote:
>>>
>>>
>>>> Hi, I have a problem when I run a simple query
>>>> without sorting against an index with 210.000 documents.
>>>> After executing the query several times I get an OutOfMemory error.
>>>> I am creating a new IndexSearcher(pathDir) for every search.
>>>> I don't know whether I need to create only one IndexSearcher
>>>> and cache it.
>>>> If I search an index with only 50.000 documents, the OutOfMemory
>>>> error doesn't appear.
>>>> ------------------------
>>>> ENVIRONMENT DESCRIPTION:
>>>> ------------------------
>>>>
>>>> ---SERVER---
>>>> MEMORY 2GB
>>>> APP SERVER Jboss3.2.3
>>>> JAVA_OPTS -Xmx640M -Xms640M
>>>>
>>>> ----LUCENE 1.4.3-------
>>>> INDEX +- 210.000 documents
>>>> EACH DOCUMENT +- 20 fields (metadatas)
>>>> SIZE TEXT DOCUMENT 1k
>>>>
>>>> ------------------------
>>>> ERROR:
>>>> ------------------------
>>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>>> java.lang.OutOfMemoryError
>>>> 18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
>>>> java.lang.OutOfMemoryError
>>>> 18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected  Error; 
>>>> nested exception is:
>>>>         java.lang.OutOfMemoryError
>>>> 18:52:18,661 ERROR [STDERR]     at 
>>>> org.jboss.ejb.plugins.LogInterceptor.handleException 
>>>> (LogInterceptor.java:374)
>>>> 18:52:18,661 ERROR [STDERR]     at 
>>>> org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
>>>> 18:52:18,661 ERROR [STDERR]     at 
>>>> org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke 
>>>> (ProxyFactoryFinderInterceptor.java:122)
>>>> 18:52:18,662 ERROR [STDERR]     at 
>>>> org.jboss.ejb.StatelessSessionContainer.internalInvoke 
>>>> (StatelessSessionContainer.java:331)
>>>> 18:52:18,662 ERROR [STDERR]     at org.jboss.ejb.Container.invoke 
>>>> (Container.java:700)
>>>> 18:52:18,662 ERROR [STDERR]     at 
>>>> sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>>>> 18:52:18,662 ERROR [STDERR]     at 
>>>> sun.reflect.DelegatingMethodAccessorImpl.invok
>>>> .
>>>> .
>>>> Exception java.lang.OutOfMemoryError: requested 4 bytes for CMS:   Work 
>>>> queue overflow; try -XX:-CMSParallelRemarkEnabled. Out of  swap  space?
>>>>
>>>>
>>>> Could anybody help me???
>>>>
>>>> Thanks in advance
>>>>
>>>>     Mari Luz
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
> 


