lucene-java-user mailing list archives

From Chantal Ackermann <chantal.ackerm...@biomax.de>
Subject OutOfMemoryError
Date Wed, 28 Nov 2001 12:22:51 GMT
Hi to all,

Please help! I think I've thoroughly confused myself with this stuff...

I'm trying to index about 29 text files, the biggest of which is ~700 MB and the
smallest ~300 MB. I once managed to build the whole index with a merge factor of 10
and maxMergeDocs = 10,000. That took more than 35 hours, I think (I don't know
exactly), and it didn't use much RAM (though it could have). Unfortunately I had a
call to optimize() at the end, and during that optimization an IOException
("File too big") occurred while merging.
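
For reference, this is roughly the kind of setup I mean (a minimal sketch, not my actual
code; it assumes the Lucene 1.x API of the moment, where mergeFactor and maxMergeDocs are
public fields on IndexWriter, pretends each ~1 KB entry sits on its own line, and the
index path and class name are made up):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class SimpleIndexer {
    public static void main(String[] args) throws IOException {
        // one writer for all files; create=true wipes any existing index
        IndexWriter writer = new IndexWriter("/data/index", new StandardAnalyzer(), true);
        writer.mergeFactor = 10;       // merge once 10 segments pile up
        writer.maxMergeDocs = 10000;   // cap on documents per merged segment

        for (int i = 0; i < args.length; i++) {
            BufferedReader in = new BufferedReader(new FileReader(args[i]));
            String entry;
            while ((entry = in.readLine()) != null) {
                Document doc = new Document();
                doc.add(Field.Text("contents", entry));   // tokenized, indexed, stored
                writer.addDocument(doc);
            }
            in.close();
        }

        writer.optimize();   // this is the step where the "File too big" IOException hit
        writer.close();
    }
}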

Since I run the program on a multi-processor machine, I have now changed the code to
index each file in its own thread, all writing to one single IndexWriter. The merge
factor is still 10, and maxMergeDocs is at 1,000,000. I set the maximum heap size
to 1 MB.
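
In case the threading part is unclear, this sketch shows the structure I mean (one thread
per input file, all sharing the one writer); indexOneFile() is just a hypothetical
placeholder for my parsing loop, and the index path is made up:

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class ThreadedIndexer {
    public static void main(String[] args) throws Exception {
        // a single writer shared by all worker threads
        final IndexWriter writer = new IndexWriter("/data/index", new StandardAnalyzer(), true);
        writer.mergeFactor = 10;
        writer.maxMergeDocs = 1000000;

        Thread[] workers = new Thread[args.length];
        for (int i = 0; i < args.length; i++) {
            final String file = args[i];
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        indexOneFile(writer, file);   // parses entries, calls writer.addDocument()
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
            workers[i].setPriority(Thread.MIN_PRIORITY);   // this is why all threads run at minimum priority
            workers[i].start();
        }
        for (int i = 0; i < workers.length; i++) {
            workers[i].join();
        }

        writer.optimize();
        writer.close();
    }

    // hypothetical placeholder: split one file into ~1 KB entries and add each as a Document
    static void indexOneFile(IndexWriter writer, String file) throws IOException {
        // ...
    }
}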

I tried using a RAMDirectory (as mentioned on the mailing list) as well as just plain
IndexWriter.addDocument(). At the moment it doesn't seem to make any difference:
after a while _all_ the threads exit, one after another (not all at once!), with an
OutOfMemoryError. The priority of all of them is set to the minimum.
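
To be concrete about how I understood the RAMDirectory advice on the list: buffer a batch
of documents in a RAMDirectory, then merge each batch into the on-disk index with
addIndexes(). The sketch below is my reading of that pattern, not code I can vouch for;
the class layout, field name, and flush threshold are all made up:

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class RamBatchIndexer {
    private final IndexWriter diskWriter;          // the long-lived on-disk index
    private RAMDirectory ramDir;                   // current in-memory batch
    private IndexWriter ramWriter;
    private int buffered = 0;
    private static final int FLUSH_EVERY = 10000;  // batch size, picked arbitrarily

    public RamBatchIndexer(IndexWriter diskWriter) throws IOException {
        this.diskWriter = diskWriter;
        newBatch();
    }

    private void newBatch() throws IOException {
        ramDir = new RAMDirectory();
        ramWriter = new IndexWriter(ramDir, new StandardAnalyzer(), true);
        buffered = 0;
    }

    public void add(String entry) throws IOException {
        Document doc = new Document();
        doc.add(Field.Text("contents", entry));
        ramWriter.addDocument(doc);
        if (++buffered >= FLUSH_EVERY) {
            flush();   // without this bound the batch grows until the heap is gone
        }
    }

    public void flush() throws IOException {
        ramWriter.close();
        diskWriter.addIndexes(new Directory[] { ramDir });  // merge the RAM segments into the disk index
        newBatch();
    }
}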

Even if the multithreading doesn't improve performance, I would be glad just to get
it running again.

I would be even happier if someone could give me a hint about the best way to index
this amount of data. (The average size of an entry that gets parsed into a Document
is about 1 KB.)

Thanks for any help!
chantal


