lucene-solr-user mailing list archives

From Grant Ingersoll
Subject Re: DataImportHandler running out of memory
Date Wed, 25 Jun 2008 01:36:57 GMT
This is a bug in MySQL.  Try setting the fetch size of the Statement on
the connection to Integer.MIN_VALUE.

See,137457 amongst a host of other  
discussions on the subject.  Basically, the driver tries to load all the
rows into memory; the only alternative is to set the fetch size to
Integer.MIN_VALUE so that it fetches one row at a time.  I've hit this
one myself, and it isn't caused by the DataImportHandler but by the
MySQL JDBC driver.
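
For reference, here is a minimal sketch of what that looks like in plain
JDBC, assuming you have access to the java.sql.Connection yourself.  The
class and method names (StreamingStatements, createStreamingStatement)
are made up for illustration and are not part of Solr or the
DataImportHandler.  The Integer.MIN_VALUE convention is specific to
MySQL's Connector/J; other drivers may treat the fetch size differently.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, named for illustration only.
final class StreamingStatements {
    private StreamingStatements() {}

    // Returns a Statement configured the way MySQL Connector/J expects
    // for row-by-row streaming: forward-only, read-only, and a fetch
    // size of Integer.MIN_VALUE.
    static Statement createStreamingStatement(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        // With Connector/J this value acts as a signal to stream rows
        // instead of buffering the whole result set in memory.
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt;
    }
}
```

With a statement set up like this, you would call executeQuery() and
walk the ResultSet as usual.  Note that while a streaming result set is
open, Connector/J expects you to read all of its rows or close it
before issuing another query on the same connection.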


On Jun 24, 2008, at 8:23 PM, wojtekpia wrote:

> I'm trying to load ~10 million records into Solr using the
> DataImportHandler. I'm running out of memory
> (java.lang.OutOfMemoryError: Java heap space) as soon as I try
> loading more than about 5 million records.
> Here's my configuration:
> I'm connecting to a SQL Server database using the sqljdbc driver.
> I've given my Solr instance 1.5 GB of memory. I have set the
> dataSource batchSize to 10000. My SQL query is "select top XXX
> field1, ... from table1". I have about 40 fields in my Solr schema.
> I thought the DataImportHandler would stream data from the DB rather
> than loading it all into memory at once. Is that not the case? Any
> thoughts on how to get around this (aside from getting a machine
> with more memory)?

Grant Ingersoll
