lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: keeping data consistent between Database and Solr
Date Tue, 15 Mar 2011 13:12:48 GMT
On 3/14/2011 9:38 PM, onlinespending@gmail.com wrote:
> But my main question is, how do I guarantee that data between my Cassandra
> database and Solr index are consistent and up-to-date?

Our MySQL database has two unique indexes.  One is a document ID,
implemented in MySQL as an autoincrement integer and in Solr as a long.
The other is what we call a tag id, implemented in MySQL as a varchar
and in Solr as a single lowercased token; it serves as Solr's uniqueKey.
We have an update trigger on the database that gives a document a new
document ID whenever that document is updated.

We have a homegrown build system for Solr.  In a nutshell, it keeps
track of the newest document ID in the Solr index.  If the DIH
delta-import fails, it doesn't update the stored ID, which means that on
the next run, it will try to index those documents again.  Changes to
existing entries in the database are picked up automatically because the
document ID is newer, but the tag id doesn't change, so the document in
Solr is overwritten.
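
The bookkeeping described above can be sketched roughly as follows.  This
is only an illustration under stated assumptions, not the actual build
system: the database is a plain list, the Solr index is a dict keyed by
tag id, and the names Row, delta_import and the fail flag are all
hypothetical.  It shows the stored ID advancing only after a successful
import, and an edited row replacing its older copy because the tag id
stays the same.

```python
from dataclasses import dataclass


@dataclass
class Row:
    doc_id: int   # becomes a newer value whenever the row is updated
    tag_id: str   # stable; plays the role of Solr's uniqueKey
    body: str


def delta_import(rows, index, last_id, fail=False):
    """Index rows newer than last_id and return the new stored ID.

    `index` is a dict standing in for the Solr index (hypothetical).
    `fail=True` simulates a failed import: the index is left untouched
    and last_id is returned unchanged, so the next run retries.
    """
    pending = sorted((r for r in rows if r.doc_id > last_id),
                     key=lambda r: r.doc_id)
    if not pending:
        return last_id
    staged = dict(index)  # work on a copy; commit only on success
    try:
        for row in pending:
            if fail:
                raise RuntimeError("simulated import failure")
            # Same tag id means the older copy is overwritten.
            staged[row.tag_id.lower()] = row
    except RuntimeError:
        return last_id
    index.clear()
    index.update(staged)
    return pending[-1].doc_id


rows = [Row(1, "a", "first"), Row(2, "b", "second")]
index = {}
last_id = delta_import(rows, index, 0)

# Editing row "a" gives it a newer document ID; the tag id is unchanged.
rows[0] = Row(3, "a", "first, edited")

after_failure = delta_import(rows, index, last_id, fail=True)
last_id = delta_import(rows, index, after_failure)
```

After the failed run the stored ID is still the old one, so the following
run picks up the edited row and replaces the existing entry for tag id
"a" rather than adding a second copy.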

Things are actually more complex than I've written, because our index is 
distributed.  Hopefully it can give you some ideas for yours.

Shawn

