lucene-java-user mailing list archives

From: ajay_gupta <ajay...@gmail.com>
Subject: Re: Lucene Indexing out of memory
Date: Tue, 02 Mar 2010 13:48:13 GMT

Hi Erick,
I tried setting setRAMBufferSizeMB to values between 200 and 500 MB as well,
but it still fails with an OOM error.
I thought that since this is file-based indexing, memory shouldn't be an
issue, but you might be right that the searching uses a lot of memory. Is
there a way to load documents in chunks, or some other way to make this
scalable?
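
In case it is useful, this is roughly how I open the writer, plus what I
mean by indexing in chunks. It is an untested sketch against the Lucene
2.9/3.0 API (the index path, the batch size, and the makeDocuments()
loader are made up for illustration):

  import java.io.File;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.util.Version;

  // Disk-based index with a bounded RAM buffer.
  FSDirectory dir = FSDirectory.open(new File("/tmp/word-context-index"));
  IndexWriter writer = new IndexWriter(dir,
      new StandardAnalyzer(Version.LUCENE_30),
      IndexWriter.MaxFieldLength.UNLIMITED);
  writer.setRAMBufferSizeMB(64); // modest buffer instead of 200-500 MB

  int count = 0;
  for (Document doc : makeDocuments()) { // stand-in for my record loader
    writer.addDocument(doc);
    if (++count % 10000 == 0) {
      writer.commit(); // flush every 10k docs so buffered state stays bounded
    }
  }
  writer.close();

Would committing in batches like this help, or does the writer's memory
keep growing regardless?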

Thanks in advance
Ajay


Erick Erickson wrote:
> 
> I'm not following this entirely, but these docs may be huge by the
> time you add context for every word in them. You say that you
> "search the existing indices then I get the content and append....".
> So is it possible that after 70K documents your additions become
> so huge that you're blowing up? Have you taken any measurements
> to determine how big the docs get as you index more and more
> of them?
> 
> If the above is off base, have you tried setting
> IndexWriter.setRAMBufferSizeMB?
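> For example (an untested sketch against the 2.9/3.0 API; 48 MB is just
> an illustrative value):
> 
>   writer.setRAMBufferSizeMB(48.0); // flush once ~48 MB of docs are buffered
> 
> With that set, the writer flushes segments by RAM usage rather than by
> document count, which keeps the indexing-side heap bounded.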
> 
> HTH
> Erick
> 
> On Tue, Mar 2, 2010 at 8:27 AM, ajay_gupta <ajay978@gmail.com> wrote:
> 
>>
>> Hi,
>> It might be a general question, but I couldn't find an answer yet. I
>> have around 90k documents totaling about 350 MB. Each document contains
>> a record with some text content. For each word in this text I want to
>> store that word's context and index it, so I read each document and,
>> for each word in it, append a fixed number of surrounding words. To do
>> that, I first search the existing index to check whether the word
>> already exists; if it does, I fetch its content, append the new context,
>> and update the document. If no context exists yet, I create a document
>> with the fields "word" and "context" and set those two fields to the
>> word and its context.
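>>
>> In code, my loop looks roughly like this (a simplified, untested sketch
>> against the Lucene 3.0-style API; "word" and "context" are my field
>> names, and searcher and writer are already open):
>>
>>   Term key = new Term("word", word);
>>   TopDocs hits = searcher.search(new TermQuery(key), 1);
>>   Document doc;
>>   if (hits.totalHits > 0) {
>>     // word seen before: append the new context to the stored context
>>     doc = searcher.doc(hits.scoreDocs[0].doc);
>>     String merged = doc.get("context") + " " + newContext;
>>     doc.removeField("context");
>>     doc.add(new Field("context", merged,
>>         Field.Store.YES, Field.Index.ANALYZED));
>>   } else {
>>     // first time this word appears: create a fresh document
>>     doc = new Document();
>>     doc.add(new Field("word", word,
>>         Field.Store.YES, Field.Index.NOT_ANALYZED));
>>     doc.add(new Field("context", newContext,
>>         Field.Store.YES, Field.Index.ANALYZED));
>>   }
>>   writer.updateDocument(key, doc); // replaces any existing doc for this word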
>>
>> I tried this with a RAM-based index, but after a certain number of
>> documents it threw an out-of-memory error, so I switched to the
>> FSDirectory approach. Surprisingly, after 70k documents that also gave
>> an OOM error. I have enough disk space, but I still get this error, and
>> I am not sure why a disk-based index would run out of memory. I expected
>> disk-based indexing to be slow, but at least scalable.
>> Could someone suggest what the issue might be?
>>
>> Thanks
>> Ajay
>> --
>> View this message in context:
>> http://old.nabble.com/Lucene-Indexing-out-of-memory-tp27755872p27755872.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://old.nabble.com/Lucene-Indexing-out-of-memory-tp27755872p27756082.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

