lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: Concern with using external SQL server for DIH
Date Thu, 06 Dec 2012 06:57:07 GMT
On 12/5/2012 9:42 AM, Spadez wrote:
> I am looking to import entries to my Solr server by using the DIH,
> connecting to an external PostgreSQL server using the JDBC driver. I will
> be importing about 50,000 entries each time.
> Is connecting to an external SQL server for my data unreliable or risky, or
> is it instead perfectly reasonable?
> My alternative is to export the SQL file on the other server, download the
> SQL file to my Solr server, import it to my Solr server's copy of PostgreSQL
> and then run the DIH on the local database.

I use DIH whenever we need a full reindex.  Our MySQL database
has 78 million records, and the import runs simultaneously on seven Solr
shards across two servers.  It takes about three hours.
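
For anyone who wants to script that kind of parallel import, here is a
minimal sketch of building the DIH request URLs for several shards. The
host names, core names, and the /dataimport handler path are assumptions
for illustration (the path is whatever your solrconfig.xml registers); this
is not the exact setup described above.

```python
from urllib.parse import urlencode


def dih_url(base_url, core, command, **params):
    """Build a DataImportHandler request URL for one core.

    Assumes the handler is registered at /dataimport in solrconfig.xml.
    """
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/{core}/dataimport?{query}"


def full_import_urls(shards, clean=True):
    """Return one full-import URL per (base_url, core) pair."""
    return [
        dih_url(base, core, "full-import", clean=str(clean).lower())
        for base, core in shards
    ]


# Hypothetical shard layout: two servers, a few cores each.
SHARDS = [
    ("http://solr1.example.com:8983/solr", "shard1"),
    ("http://solr1.example.com:8983/solr", "shard2"),
    ("http://solr2.example.com:8983/solr", "shard3"),
]

if __name__ == "__main__":
    for url in full_import_urls(SHARDS):
        print(url)
```

Each URL can be requested with any HTTP client to start the import on
that shard, and the same helper with command "status" gives the URL to
poll for progress.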

The only instability that we ever noticed was on older Solr versions
(1.4.x) with a low mergeFactor.  We ran into a situation where Solr was
doing so many simultaneous merges that it stopped indexing data long
enough for the JDBC connection to time out.  We increased our mergeFactor,
and newer Solr versions offer more merge configuration options, so we now
run more merge threads as well.

