manifoldcf-user mailing list archives

From Martin Gielow <martin.gie...@gmail.com>
Subject Re: Problem with continuous jobs deleting their documents on restart of Agent
Date Mon, 08 Oct 2012 16:06:52 GMT
Hi Karl,

thanks for the lightning-speed reply! :)

On Mon, Oct 8, 2012 at 5:23 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Martin,
>
> The behavior you describe is expected only if you are either deleting
> the job, or the job is set to expire old documents after a certain
> time interval (and that interval has transpired).
>
> Can you tell me what your expiration interval is?
>
>
The expiration interval is set to 1440 (minutes, according to the
interface). I also just tried leaving the box empty, so that there would
be no expiration, but the behaviour remained the same.


> Also, when you say "shutting down agents process", can you clarify
> what deployment model you are using?  How are you shutting down this
> process?
>

I am using a slightly modified version of the multiprocess example with
PostgreSQL as the DBMS. To run and shut down the agents I use the batch files
that are provided with the example (start-agents.bat and stop-agents.bat).
I have also tried running the agents process from Eclipse so that I could
debug into it, and got the same results.


> Thanks,
> Karl
>

Regards,
Martin



>
> On Mon, Oct 8, 2012 at 11:18 AM, Martin Gielow <martin.gielow@gmail.com>
> wrote:
> > Hello,
> >
> > I'm using Manifold to crawl several data sources using the Wiki and the
> > JDBC connectors. I have set the associated jobs to run continuously so
> > that new documents will be added in a timely manner. The problem I am
> > having with this is that whenever the Agent is stopped and then
> > restarted, the jobs will delete all of their documents (also propagating
> > the deletes to the associated output connection) before turning
> > themselves inactive (which they shouldn't, as they are set to run
> > continuously).
> >
> > If I then restart the job, in the case of the JDBC connection, it does
> > not find any previously added documents and will set itself inactive
> > again. In the case of the Wiki connection, the documents are also
> > deleted, but are successfully reindexed when the job is restarted
> > manually.
> >
> > The only way I found to prevent the jobs from deleting their items in
> > this case was to manually stop the affected jobs before the Agent is
> > stopped (using the abort option) and to restart them after the Agent has
> > been restarted.
> >
> >
> > I am using the 1.0 release of Manifold and couldn't find anything
> > regarding this behaviour in either the documentation or the wiki.
> >
> > Is there an obvious flaw with my setup or something I may have missed in
> > the configuration?
> >
> > Thanks in advance for any tips!
> >
> > Regards,
> > Martin
>
