lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ajay_gupta <ajay...@gmail.com>
Subject Lucene Indexing out of memory
Date Tue, 02 Mar 2010 13:27:47 GMT

Hi,
It might be general question though but I couldn't find the answer yet. I
have around 90k documents sizing around 350 MB. Each document contains a
record which has some text content. For each word in this text I want to
store context for that word and index it so I am reading each document and
for each word in that document I am appending fixed number of surrounding
words. To do that first I search in existing indices if this word already
exist and if it is then I get the content and append the new context and
update the document. In case no context exist I create a document with
fields "word" and "context" and add these two fields with values as word
value and context value.

I tried this in RAM but after certain no of docs it gave out of memory error
so I thought to use FSDirectory method but surprisingly after 70k documents
it also gave OOM error. I have enough disk space but still I am getting this
error.I am not sure even for disk based indexing why its giving this error.
I thought disk based indexing will be slow but atleast it will be scalable. 
Could someone suggest what could be the issue ?

Thanks
Ajay
-- 
View this message in context: http://old.nabble.com/Lucene-Indexing-out-of-memory-tp27755872p27755872.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message