lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: Importing data from SQL server to Solr (Event or realtime)
Date Tue, 15 Mar 2016 13:16:24 GMT
On 3/15/2016 2:12 AM, Pascal Ruppert wrote:
> Hi, I'd like to know how the DIH handles updating Solr. Does it update after a specific
> amount of time, or is there some trigger that activates the DIH every time something is
> committed to the RDBMS?
> Ideally there would be something like "realtime" synchronization between Solr
> and our SQL servers.
> Also, how reliable is the DIH? I read that Elasticsearch's JDBC plugin has some problems
> reconstructing data with too many joins. Are there any such issues with Solr?

Every modern operating system comes with scheduling capability.  For
UNIX/Linux, it's cron.  For Windows, it's the task scheduler.

Solr itself has no scheduling capability built in ... and with
well-tested and debugged scheduling already available to you, it's not
likely that it ever will.
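
As a rough illustration of the scheduler-driven approach, here is a minimal
Python sketch of a script that cron or the task scheduler could run to ask
DIH for an import over HTTP.  The base URL, the "dataimport" handler path and
the choice of delta-import are assumptions for the example, not anything
from the original question; delta-import also assumes the DIH config
defines a deltaQuery.

```python
#!/usr/bin/env python3
"""Ask Solr's DataImportHandler to run an import; intended for cron/Task Scheduler."""
import sys
import urllib.error
import urllib.parse
import urllib.request

# Assumed for illustration; adjust to your installation.
SOLR_BASE = "http://localhost:8983/solr"
HANDLER = "dataimport"  # path where the DIH request handler is registered


def build_dih_url(base, core, command="delta-import", handler=HANDLER, **params):
    """Build the DIH request URL, e.g. .../solr/<core>/dataimport?command=..."""
    query = {"command": command, "wt": "json"}
    query.update(params)  # extra DIH parameters such as commit or clean
    return "{}/{}/{}?{}".format(
        base.rstrip("/"), core, handler, urllib.parse.urlencode(query)
    )


def trigger_import(url, timeout=30):
    """Send the request and return Solr's response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Only contacts Solr when a core name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    url = build_dih_url(SOLR_BASE, sys.argv[1], commit="true")
    try:
        print(trigger_import(url))
    except urllib.error.URLError as exc:
        sys.exit("DIH request failed: {}".format(exc))
```

A crontab entry along the lines of
"*/5 * * * * /usr/bin/python3 /path/to/dih_trigger.py mycore" (path, script
name and core name are hypothetical) would then run it every five minutes.
That is polling, not a trigger on each database commit.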

I use DIH when I need to do a full rebuild on my index.  It has always
been reliable.  I am *indirectly* doing joins -- on the DB server, with
a view.  Solr itself should be unaffected by joins.  If your database
software can handle the join query without problems, DIH should work
with it.

If you're importing millions of records in a single DIH run, you might
need to increase your maxMergeCount setting in solrconfig.xml (to 6).
Otherwise you might find that the database connection times out and
closes.  Here are a couple of mailing list messages (both from the same
thread) with some details:
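
For reference, the maxMergeCount change mentioned above might look roughly
like the following fragment inside the indexConfig section of
solrconfig.xml.  This is a sketch: only the value 6 for maxMergeCount comes
from this message; the maxThreadCount value is an assumption added so that
both merge scheduler settings are given explicitly, and should be tuned for
your hardware.

```xml
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <!-- Value suggested in the message above -->
    <int name="maxMergeCount">6</int>
    <!-- Assumed value, not from the message; adjust for your disks -->
    <int name="maxThreadCount">1</int>
  </mergeScheduler>
</indexConfig>
```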

