lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com>
Subject Re: How to handle database replication delay when using DataImportHandler?
Date Fri, 30 Jan 2009 04:01:31 GMT
Yeah that is an option.

On Fri, Jan 30, 2009 at 12:27 AM, Gregg Donovan <greggny3@gmail.com> wrote:
> Noble,
>
> Thanks for the suggestion. The unfortunate thing is that we really don't
> know ahead of time what sort of replication delay we're going to encounter
> -- it could be one millisecond or it could be one hour. So, we end up
> needing to do something like:
>
> For delta-import run N:
> 1. query DB slave for "seconds_behind_master", use this to calculate
> Date(N).
> 2. query DB slave for records updated since Date(N - 1)
>
> I see there are plugin points for EventListener classes (onImportStart,
> onImportEnd). Would those be the right spot to calculate these dates so that
> I could expose them to my custom function at query time?
>
> Thanks.
>
> --Gregg
>
> On Wed, Jan 28, 2009 at 11:20 PM, Noble Paul നോബിള്‍ नोब्ळ् <noble.paul@gmail.com> wrote:
>
>> The problem you are trying to solve is that you cannot use
>> ${dataimporter.last_index_time} as is. You may need something like
>> ${dataimporter.last_index_time} - 3secs
>>
>> Am I right?
>>
>> There is no straightforward way to do this.
>> 1) You may write your own function, say 'lastIndexMinus3Secs', and add
>> it. Functions can be plugged in to DIH using a <function
>> name="lastIndexMinus3Secs" class="foo.Foo"/> under the <dataConfig>
>> tag. You can then use it as
>> ${dataimporter.functions.lastIndexMinus3Secs()}
>> This adds to the existing built-in functions.
>>
>> http://wiki.apache.org/solr/DataImportHandler#head-5675e913396a42eb7c6c5d3c894ada5dadbb62d7
>>
>> The class must extend org.apache.solr.handler.dataimport.Evaluator
>>
>> We may add a standard function for this too. You can raise an issue.
>> --Noble
>>
>>
>>
>> On Thu, Jan 29, 2009 at 6:26 AM, Gregg <greggny3@gmail.com> wrote:
>> > I'd like to use the DataImportHandler running against a slave database that,
>> > at any given time, may be significantly behind the master DB. This can cause
>> > updates to be missed if you use the clock-time as the "last_index_time."
>> > E.g., if the slave catches up to the master between two delta-imports.
>> >
>> > Has anyone run into this? In our non-DIH indexing system we get around this
>> > by either using the slave DB's seconds-behind-master or the max last update
>> > time of the records returned.
>> >
>> > Thanks.
>> >
>> > Gregg
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>



-- 
--Noble Paul
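
The two workarounds discussed in this thread (subtracting the slave's
seconds-behind-master from the clock time, or taking the max last-update
time of the rows returned) can be sketched in plain Java as below. This is
only an illustration under assumptions: the class DeltaWatermark and its
method names are hypothetical and not part of Solr or DIH, the lag value is
assumed to have been read from the slave already, and the
"yyyy-MM-dd HH:mm:ss" layout in UTC is an assumed timestamp format for the
delta query.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;

/** Hypothetical helper for illustration; not part of Solr or DataImportHandler. */
public final class DeltaWatermark {

    // Assumed timestamp layout for the delta query; UTC is also an assumption.
    private static final DateTimeFormatter SQL_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    private DeltaWatermark() {}

    /**
     * Date(N) from the thread: the point in master time up to which the
     * slave is believed to have replicated, i.e. now minus the reported lag.
     */
    public static Instant safeWatermark(Instant now, long secondsBehindMaster) {
        if (secondsBehindMaster < 0) {
            throw new IllegalArgumentException("lag must be non-negative");
        }
        return now.minus(Duration.ofSeconds(secondsBehindMaster));
    }

    /**
     * Alternative from the thread: use the newest update time among the rows
     * just imported; keep the previous watermark when no rows came back.
     */
    public static Instant maxLastUpdated(List<Instant> rowUpdateTimes, Instant previous) {
        return rowUpdateTimes.stream().max(Instant::compareTo).orElse(previous);
    }

    /** Formats a watermark for use in a delta query's WHERE clause. */
    public static String toSqlLiteral(Instant watermark) {
        return SQL_FORMAT.format(watermark);
    }
}
```

The watermark saved after run N - 1 is what run N would compare against,
for example as the string returned by a custom DIH function like the one
suggested above. Wiring this into an Evaluator subclass is not shown; the
method to override should be checked against the Evaluator class in the
Solr release in use.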
