lucene-java-user mailing list archives

From Otis Gospodnetic <>
Subject Re: out of memory while indexing one single file
Date Tue, 08 Jun 2004 15:38:33 GMT

I don't know if the author of CLucene is on this list.  You may get
better help on the CLucene mailing list or forum on


--- Yue Sun <> wrote:
> Hi,
>
> First, I am not sure whether I should post my question here, since I am
> using CLucene (the C++ port of Lucene) to build indexes. I hope someone
> here can help me.
>
> I am indexing on a Solaris machine with 1 GB of memory. I use a RAM
> writer and an FS writer, and write into the FS index once in a while. I
> am currently testing with single input files. On files smaller than
> 50 MB the program works well. On bigger files it runs out of the 1 GB of
> memory and crashes, no matter how I set parameters such as the merge
> factor and the frequency of writing to disk.
>
> My input files are in ASN.1 format, each with nested entries, and each
> entry with a varying number of fields. I index every outermost entry as
> a Lucene document and each data field as a Lucene field. So, unlike most
> setups, the number of fields indexed by my program is quite large. Some
> files have more than 1000 different field names. There is no problem
> with the maximum number of file descriptors. Among the files that
> failed, some Lucene documents have more than 40,000 field pairs
> (duplicate field names with different values). I think this is why so
> much memory is consumed. One of the failed cases has an input file size
> of 66 MB and crashes after processing about 3800 documents.
>
> Is there any way to improve the program so that it uses less memory? Any
> suggestion would be appreciated!
>
> Regards,
> Yue Sun
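The quoted message describes buffering documents with a RAM writer and writing them to the FS index "once in a while". One idea that may be worth trying, offered here only as an assumption and not as a confirmed fix, is to trigger the write to disk by the estimated size of the buffered documents instead of by a document count, since a single document with 40,000 field pairs can be far larger than a typical one. The sketch below illustrates that pattern in plain Python. It does not call the Lucene or CLucene API; `ByteBudgetBuffer`, `estimate_size`, and the JSON "segment" files are hypothetical stand-ins for the RAM writer, its memory use, and the FS index.

```python
import json
import os


class ByteBudgetBuffer:
    """Buffer documents in memory; flush to disk once a byte budget is hit.

    Hypothetical stand-in for a RAM-writer / FS-writer pair. A document is
    a list of (field_name, value) string pairs; duplicate names are allowed.
    """

    def __init__(self, out_dir, max_buffered_bytes):
        self.out_dir = out_dir
        self.max_buffered_bytes = max_buffered_bytes
        self.buffer = []            # documents waiting to be written
        self.buffered_bytes = 0     # rough size of everything in self.buffer
        self.segments_written = 0   # number of files flushed so far

    @staticmethod
    def estimate_size(doc):
        # Rough estimate: UTF-8 length of every field name plus its value.
        # A real index holds more per field, so treat this as a lower bound.
        return sum(len(name.encode("utf-8")) + len(value.encode("utf-8"))
                   for name, value in doc)

    def add(self, doc):
        self.buffer.append(doc)
        self.buffered_bytes += self.estimate_size(doc)
        # Flush on size, not on document count, so one huge document
        # cannot silently push the buffer far past the budget.
        if self.buffered_bytes >= self.max_buffered_bytes:
            self.flush()

    def flush(self):
        if not self.buffer:
            return  # nothing buffered, nothing to write
        name = "segment_{}.json".format(self.segments_written)
        with open(os.path.join(self.out_dir, name), "w", encoding="utf-8") as f:
            json.dump(self.buffer, f)
        self.segments_written += 1
        self.buffer = []
        self.buffered_bytes = 0
```

With a budget of, say, a few tens of megabytes, each call to `add` either keeps the document in memory or writes everything buffered so far to a new file; a final `flush()` after the last document writes whatever remains. Whether the equivalent change helps in CLucene depends on where its memory actually goes, which the original message does not establish.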
