manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Continuous crawl - add schedule to stop for maintenance window
Date Thu, 27 Aug 2015 10:52:03 GMT
The continuous crawl basically checks documents periodically to determine
whether they've changed, and if they *haven't* changed it doubles the time
until the next check, until it hits a maximum.  So your description of what
it is doing is consistent with documents that haven't changed much or at
all.
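
As a rough illustration of that doubling behaviour, here is a minimal sketch in Python. It is not ManifoldCF code: the function name, the 60-minute starting interval and the 1440-minute maximum are all assumptions chosen for the example, and the real values come from the job's own settings.

```python
def unchanged_check_intervals(initial_minutes, max_minutes, checks):
    """Return the wait (in minutes) before each of `checks` successive
    rechecks of a document that never changes.

    Hypothetical model only: each unchanged check doubles the wait,
    capped at max_minutes.
    """
    intervals = []
    interval = initial_minutes
    for _ in range(checks):
        intervals.append(interval)
        # Document was unchanged: double the wait, but never exceed the cap.
        interval = min(interval * 2, max_minutes)
    return intervals


# Assumed values: first recheck after 1 hour, cap at 1 day.
print(unchanged_check_intervals(60, 1440, 7))
# → [60, 120, 240, 480, 960, 1440, 1440]
```

With these assumed numbers an unchanged document ends up being checked only once a day after the first five rechecks, which is why a continuous job can look idle for long stretches.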

The best way to see what is going on in that case is to use the Document
Status or Queue Status reports to find out when individual documents are
next scheduled to be checked.

Karl


On Thu, Aug 27, 2015 at 6:34 AM, Mcguinness, Cathal <
Cathal.Mcguinness@fmr.com.invalid> wrote:

> Hi Karl,
> We are using our own custom connector class which extends JDBCConnector.
> From looking at JDBCConnector I can see that in the processDocuments
> method you are getting the version query, ts.versionQuery and you are
> logging this activity as "external query".
> Our logic is pretty similar but we are logging this activity as "Version
> Query".  So when we say our Version Query is not being run, we are really
> saying that processDocuments has not been run.
>
> Regards
> Cathal.
>
> -----Original Message-----
> From: Karl Wright [mailto:daddywri@gmail.com]
> Sent: 26 August 2015 6:07
> To: dev
> Cc: Mcguinness, Cathal; Kennedy, Eoghan
> Subject: Re: Continuous crawl - add schedule to stop for maintenance window
>
> Hi,
>
> There is nothing wrong with your schedule.  ManifoldCF continuous jobs
> execute during the open time windows but have somewhat different lifecycles
> than jobs that just run to completion.
>
> I could clarify further, but I don't know what the activity called "Version
> Query" is.  I don't believe any of our connectors have that activity.  Can
> you clarify what connector you are using, and if it is custom, at what
> point does it log this activity (e.g. addSeedDocuments, processDocuments,
> etc)?
>
> Karl
>
>
> On Wed, Aug 26, 2015 at 12:58 PM, Sathiyanarayanan, Ramanan <
> ramanan.sathiyanarayanan@fmr.com.invalid> wrote:
>
> > Hello Karl Wright,
> >
> >
> >
> > We tried configuring the way you suggested that would put a “pause” on a
> > continuous job during our weekly outage window.
> >
> > It is working partially: the job stop, unwait, continue and external
> > query activities run as expected after the pause time (180 mins).
> >
> > But the VERSION query is not executed, which in turn means the data query
> > and other downstream processes do not run.
> >
> >
> >
> > Below are screenshots of our job configuration for the “continuous job”:
> >
> > 1. It is configured as “Start when schedule window starts”
> >
> > 2. Other config related to scheduling.
> >
> > 3. Logs related to the issue.
> >
> > -----Original Message-----
> > From: Mcguinness, Cathal
> > Sent: Monday, August 24, 2015 4:59 AM
> > To: Kennedy, Eoghan; Sathiyanarayanan, Ramanan
> > Subject: FW: Continuous crawl - add schedule to stop for maintenance window
> >
> >
> >
> > FYI..
> >
> >
> >
> > -----Original Message-----
> > From: Karl Wright [mailto:daddywri@gmail.com <daddywri@gmail.com>]
> > Sent: 21 August 2015 6:33
> > To: dev
> > Subject: Re: Continuous crawl - add schedule to stop for maintenance window
> >
> > Right.
> >
> > You can set this up with ONE record, if you do some calculations.  The
> > record should begin Monday at midnight, and go for some period of time
> > until Saturday at 10AM.  The number of minutes the job can run is:
> >
> > 5 * 24 * 60 + 10 * 60 minutes
> >
> > Karl
> >
> > On Fri, Aug 21, 2015 at 12:30 PM, Mcguinness, Cathal <
> > Cathal.Mcguinness@fmr.com.invalid> wrote:
> >
> > > Thanks Karl,
> > > I am still a little confused here.
> > > I can add a schedule to run Monday to Friday, which is fine.
> > > But how can I tell it to run on Saturday but only until 10 am? Is it a
> > > matter of specifying the Maximum runtime parameter?
> > >
> > > Ultimately what I want is for the job to stop running at 10am on Saturday
> > > and start running again first thing Monday.
> > >
> > > Regards
> > > Cathal.
> > >
> > > -----Original Message-----
> > > From: Karl Wright [mailto:daddywri@gmail.com <daddywri@gmail.com>]
> > > Sent: 21 August 2015 5:07
> > > To: dev
> > > Subject: Re: Continuous crawl - add schedule to stop for maintenance window
> > >
> > > Hi Cathal,
> > >
> > > You can have multiple schedule records, each defining a different time
> > > window.  It is the cumulative time window that MCF cares about, not what is
> > > found in each individual record.
> > >
> > > Karl
> > >
> > > On Fri, Aug 21, 2015 at 11:54 AM, Mcguinness, Cathal <
> > > Cathal.Mcguinness@fmr.com.invalid> wrote:
> > >
> > > > Hi,
> > > > What is the best way to configure a Continuous crawl job so that it does
> > > > not run within a defined maintenance window?
> > > > This maintenance window is Saturday 10am until midnight Sunday.
> > > > If I set a schedule but omit Sunday, the job will still be running on
> > > > Saturday during the maintenance window, which is not desirable.
> > > > What is the best approach here?  Is this type of configuration
> > > > possible?  I know we could go with pausing the job, but if possible we
> > > > would like the schedule to be in place.
> > > >
> > > > Regards
> > > > Cathal.
>
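
The single-record schedule discussed earlier in the thread (start Monday at midnight, run for 5 * 24 * 60 + 10 * 60 minutes) can be sanity-checked with a short Python sketch. This only models the arithmetic; it does not call ManifoldCF, and the names `WINDOW_MINUTES` and `in_window` are made up for the example.

```python
from datetime import datetime, timedelta

# Five full days (Mon-Fri) plus ten hours of Saturday, in minutes.
WINDOW_MINUTES = 5 * 24 * 60 + 10 * 60  # 7800


def in_window(moment):
    """Return True if `moment` falls inside the Monday 00:00 to
    Saturday 10:00 window of its own week (hypothetical helper)."""
    # Monday 00:00 of the week containing `moment` (weekday() is 0 on Monday).
    week_start = (moment - timedelta(days=moment.weekday())).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return moment - week_start < timedelta(minutes=WINDOW_MINUTES)


print(WINDOW_MINUTES)                            # → 7800
print(in_window(datetime(2015, 8, 29, 9, 59)))   # Saturday 09:59 → True
print(in_window(datetime(2015, 8, 29, 10, 0)))   # Saturday 10:00 → False
print(in_window(datetime(2015, 8, 31, 0, 0)))    # Monday 00:00   → True
```

The window closes exactly at Saturday 10:00 and reopens at Monday 00:00, which matches the maintenance window of Saturday 10am until midnight Sunday described in the original question.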
