manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Detailed monitoring of jobs / job stuck
Date Mon, 17 Aug 2015 21:15:48 GMT
If your startup script starts *all* the mcf processes, you can do that.
Otherwise, it would be a bad idea.
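
As a rough sketch only, a startup script could guard the lock clean step so that it runs only when no ManifoldCF process is alive. The install path, the `lock-clean.sh` name, and the `pgrep` pattern below are assumptions about a typical multiprocess-file-example layout, not something confirmed in this thread; adjust them to your deployment. The check only sees processes on the local machine, so it is not sufficient if other hosts share the same synchronization directory.

```shell
#!/bin/sh
# Sketch: clean stale locks only when no ManifoldCF process is running.
# All names below are assumptions; adjust them to your installation.

# Hypothetical install directory of the multiprocess example.
MCF_HOME="${MCF_HOME:-/opt/manifoldcf/multiprocess-file-example}"
# Assumed name of the lock clean script inside MCF_HOME.
LOCK_CLEAN="${LOCK_CLEAN:-./lock-clean.sh}"
# Assumed regex matching the command line of ManifoldCF JVMs.
MCF_PATTERN="${MCF_PATTERN:-org[.]apache[.]manifoldcf}"

# Returns 0 if a local process whose command line matches the pattern exists.
mcf_running() {
    pgrep -f "$MCF_PATTERN" >/dev/null 2>&1
}

# Runs the lock clean script only when nothing is running.
# Returns non-zero if the step was skipped or the script failed.
mcf_lock_clean_if_stopped() {
    if mcf_running; then
        echo "ManifoldCF still running; not cleaning locks" >&2
        return 1
    fi
    ( cd "$MCF_HOME" && "$LOCK_CLEAN" )
}

# In a real startup script, call the guard before starting any process:
# mcf_lock_clean_if_stopped && start_everything   # start_everything is hypothetical
```

The guard refuses to clean rather than killing anything, so a forgotten agents process results in a visible failure instead of locks being removed underneath it.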

ZooKeeper is resilient against this problem (stale locks left behind by killed processes), so you can also switch to that.

Karl


On Mon, Aug 17, 2015 at 5:09 PM, Roman Šitina <roman@sitina.cz> wrote:

> Thank you very much, that helped!
>
> Is it OK to put a lock clean call in our startup script just to make
> sure? And is it worth switching to the ZooKeeper version?
>
> Thanks again
> Roman
>
> On 17 August 2015 at 22:55, Karl Wright <daddywri@gmail.com> wrote:
> > I would try executing the lock clean procedure.  Shut down all ManifoldCF
> > processes and web applications, then run the LockClean script, then start
> > them back up again.  If you have shut any processes down with kill -9,
> > then you may have locks hanging around.
> >
> > Karl
> >
> >
> > On Mon, Aug 17, 2015 at 4:34 PM, Roman Šitina <roman@sitina.cz> wrote:
> >>
> >> It is a multiprocess setup with file synchronisation.
> >>
> >> I can see reprioritisation in logs and after a while all I can see are
> >> these logs cycling:
> >>
> >> DEBUG 2015-08-17 20:27:19,980 (Expire stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Expiration stuffer thread woke
> >> up
> >>
> >> DEBUG 2015-08-17 20:27:19,981 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf - Beginning query to look for documents to
> >> expire
> >>
> >> DEBUG 2015-08-17 20:27:19,981 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf -  Attempt 1 to expire documents, after 0
> >> ms
> >>
> >> DEBUG 2015-08-17 20:27:19,983 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf -  Expiring 0 documents
> >>
> >> DEBUG 2015-08-17 20:27:19,984 (Expire stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Expiration stuffer thread:
> >> Found 0 documents to expire
> >>
> >> DEBUG 2015-08-17 20:27:19,996 (Expire stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Expiration stuffer thread woke
> >> up
> >>
> >> DEBUG 2015-08-17 20:27:19,996 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf - Beginning query to look for documents to
> >> expire
> >>
> >> DEBUG 2015-08-17 20:27:19,997 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf -  Attempt 1 to expire documents, after 1
> >> ms
> >>
> >> DEBUG 2015-08-17 20:27:19,999 (Expire stuffer thread) -
> >> org.apache.manifoldcf.perf -  Expiring 0 documents
> >>
> >> DEBUG 2015-08-17 20:27:19,999 (Expire stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Expiration stuffer thread:
> >> Found 0 documents to expire
> >>
> >> DEBUG 2015-08-17 20:27:20,077 (Document cleanup stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document cleanup stuffer thread
> >> woke up
> >>
> >> DEBUG 2015-08-17 20:27:20,077 (Document delete stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document delete stuffer thread
> >> woke up
> >>
> >> DEBUG 2015-08-17 20:27:20,078 (Document cleanup stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document cleanup stuffer thread
> >> found nothing to do
> >>
> >> DEBUG 2015-08-17 20:27:20,078 (Document delete stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document delete stuffer thread
> >> found nothing to do
> >>
> >> DEBUG 2015-08-17 20:27:20,083 (Document delete stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document delete stuffer thread
> >> woke up
> >>
> >> DEBUG 2015-08-17 20:27:20,083 (Document cleanup stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document cleanup stuffer thread
> >> woke up
> >>
> >> DEBUG 2015-08-17 20:27:20,084 (Document delete stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document delete stuffer thread
> >> found nothing to do
> >>
> >> DEBUG 2015-08-17 20:27:20,084 (Document cleanup stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document cleanup stuffer thread
> >> found nothing to do
> >>
> >> DEBUG 2015-08-17 20:27:21,078 (Document cleanup stuffer thread) -
> >> org.apache.manifoldcf.crawlerthreads - Document cleanup stuffer thread
> >> woke up
> >>
> >>
> >>
> >> On 17 August 2015 at 21:29, Karl Wright <daddywri@gmail.com> wrote:
> >> > 2.1 does do background reprioritization.  If you want to see that
> >> > occurring in the log, you would need to add the following in your
> >> > properties.xml file:
> >> >
> >> > <property name="org.apache.manifoldcf.scheduling" value="DEBUG"/>
> >> >
> >> > Can I have more information?  Specifically, is this a multiprocess
> >> > setup, and if so, is this ZooKeeper or file system synchronization?
> >> >
> >> > Karl
> >> >
> >> >
> >> > On Mon, Aug 17, 2015 at 2:57 PM, Roman Šitina <roman@sitina.cz> wrote:
> >> >>
> >> >> Hello Karl,
> >> >>
> >> >> thanks for your quick reply!
> >> >>
> >> >> The version is 2.1. I tried to get detailed logging by setting
> >> >> log4j.rootLogger=INFO, MAIN in logging.ini but that did not help -
> >> >> only WARN-level messages were logged after the restart.
> >> >>
> >> >> Roman
> >> >>
> >> >> On 17 August 2015 at 20:35, Karl Wright <daddywri@gmail.com> wrote:
> >> >> > Hi Roman,
> >> >> >
> >> >> > ManifoldCF needs to reprioritize documents whenever you pause or
> >> >> > restart jobs.  For jobs with large numbers of documents, the total
> >> >> > amount of work involved in this is significant.  But, depending on
> >> >> > the precise ManifoldCF version you are using, the reprioritization
> >> >> > typically continues in background while MCF runs your job.
> >> >> >
> >> >> > Can you tell me more about what version of MCF you are trying here?
> >> >> >
> >> >> > Karl
> >> >> >
> >> >> >
> >> >> > On Mon, Aug 17, 2015 at 2:13 PM, Roman Šitina <sitina@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hello,
> >> >> >>
> >> >> >> I have a ManifoldCF setup based on multiprocess-file-example, which
> >> >> >> is backed by PostgreSQL.
> >> >> >>
> >> >> >> I have created a connection from Documentum to ElasticSearch with
> >> >> >> about 300 000 documents. I was able to crawl several thousand
> >> >> >> documents so the connection is working properly.
> >> >> >>
> >> >> >> What I'm not sure about is that when I pause or stop the job and
> >> >> >> then run it again, it takes a while and it looks like ManifoldCF is
> >> >> >> doing nothing (30 minutes). After that time I usually try to restart
> >> >> >> all processes.
> >> >> >>
> >> >> >> I looked at all logs - manifoldcf.log, documentum-registry,
> >> >> >> documentum-server and DFC itself but I can't find any relevant
> >> >> >> information.
> >> >> >>
> >> >> >> Can you help me figure out the best way to monitor the progress
> >> >> >> of jobs that do not appear to be progressing?
> >> >> >>
> >> >> >> Thank you very much
> >> >> >> Roman
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
