lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: Getting OutOfMemoryError: Java heap space in Solr
Date Wed, 09 Jul 2014 15:54:34 GMT
On 7/9/2014 6:02 AM, yuvaraj ponnuswamy wrote:
> Hi,
> I am often getting an OutOfMemoryError ("java.lang.OutOfMemoryError: Java
> heap space") in production because a particular TreeMap is taking a lot
> of memory in the JVM.
> When I looked into the config files, I found an entity called
> UserQryDocument where I am fetching the data from certain tables.
> I also have a sub-entity called "UserLocation" where I am using the
> CachedSqlEntityProcessor to get the fields from the cache. It seems to
> hold about 2,00,000 records in total.
> processor="CachedSqlEntityProcessor" cacheKey="user_pin" cacheLookup="UserQueryDocumentNonAuthor.DocKey">
> I have some other entities like this, and there I am also using the
> CachedSqlEntityProcessor in the sub-entity.
> When I looked into the heap dump (java_pid57.hprof), I could see that the
> TreeMap is causing the problem, but I am not able to find which entity is
> responsible. I am using the IBM HeapAnalyzer to look into the dump.
> Can you please let me know whether there is any other way, or any other
> tool, to analyse and debug the out-of-memory issue and find the exact
> entity that is causing it?
> I have attached the entity from dataconfig.xml and a HeapAnalyzer
> screenshot.

JDBC drivers have a habit of loading the entire resultset into RAM. 
Also, you are using the cached processor ... which will effectively do
the same thing.  With millions of DB rows, this is going to require a
LOT of heap memory.  You'll want to change your JDBC connection so that
it doesn't load the entire result set, and you may also need to turn off
entity caching in Solr.  You didn't mention what database you're using. 
Here's how to fix MySQL and SQL Server so they don't load the entire
result set.  The requirements for another database are likely to be
different.
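
As a rough sketch of the JDBC change described above, the two dataSource
definitions below show the settings commonly used with DIH for MySQL and
SQL Server.  Host names, database names, credentials and the dataSource
names are placeholders, not taken from the original poster's config, and
the exact options may vary with the driver version:

```xml
<dataConfig>
  <!-- MySQL: batchSize="-1" makes DIH set the JDBC fetch size to
       Integer.MIN_VALUE, which tells Connector/J to stream rows
       instead of buffering the whole result set.
       Host, database and credentials are placeholders. -->
  <dataSource name="mysql-ds" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/exampledb"
              batchSize="-1"
              user="solr" password="secret"/>

  <!-- SQL Server (Microsoft JDBC driver): responseBuffering=adaptive
       avoids buffering the full result set; selectMethod=cursor uses
       a server-side cursor.  Newer driver versions already default
       to adaptive buffering. -->
  <dataSource name="mssql-ds" type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=exampledb;responseBuffering=adaptive;selectMethod=cursor"
              user="solr" password="secret"/>

  <document>
    <!-- entities go here; each one selects a source with
         dataSource="mysql-ds" or dataSource="mssql-ds" -->
  </document>
</dataConfig>
```

Only one of the two dataSource elements would normally be needed; both
are shown side by side for comparison.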

The best way to make DIH perform well is to use JOIN so that you can get
all your data with one entity and one SELECT query.  Let the database do
all the heavy lifting instead of having Solr send millions of queries. 
GROUP_CONCAT on the SQL side and the RegexTransformer 'splitBy' attribute
can sometimes be used to get multiple values into a field.
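
A minimal sketch of that single-entity approach, assuming MySQL (for
GROUP_CONCAT) and made-up table and column names (user_query_document,
user_location, doc_key, title, location); only the entity name and
user_pin come from the original question:

```xml
<!-- Hypothetical schema: user_query_document(doc_key, title) and
     user_location(user_pin, location).  GROUP_CONCAT is MySQL syntax. -->
<entity name="UserQryDocument" transformer="RegexTransformer"
        query="SELECT d.doc_key, d.title,
                      GROUP_CONCAT(l.location SEPARATOR '|') AS locations
               FROM user_query_document d
               LEFT JOIN user_location l ON l.user_pin = d.doc_key
               GROUP BY d.doc_key, d.title">
  <field column="doc_key" name="id"/>
  <field column="title" name="title"/>
  <!-- splitBy is a regular expression, so the pipe is escaped -->
  <field column="locations" name="location" splitBy="\|"/>
</entity>
```

The target field ("location" here) would need to be multiValued in the
schema.  Note that MySQL truncates GROUP_CONCAT output at
group_concat_max_len, which defaults to 1024 bytes, so that setting may
need to be raised for rows with many joined values.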
