manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject RE: Manifoldcf is "slow"
Date Fri, 08 Nov 2013 08:06:04 GMT

Hi Ronny,

The amount of work needed for a recrawl is highly connector-dependent.
Unfortunately, the file system connector is one of the worst in this
respect, because all directory entries have to be rescanned on every
crawl, even when only a few documents have changed.  A minimal crawl will
cut down on this a lot but won't pick up deletions, so any schedule you
come up with should include a full crawl once in a while.
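
To illustrate the trade-off described above, here is a rough conceptual
sketch in Python. It is not ManifoldCF code and does not reflect how the
file system connector is actually implemented. The names full_crawl,
minimal_crawl, and candidates are hypothetical, and the idea that a minimal
pass is handed a small set of candidate paths is an assumption made purely
for illustration. The point it shows is that detecting deletions requires
seeing the complete set of source entries, which is exactly the expensive
part.

```python
def full_crawl(index, source):
    """Visit every source entry, reindex changed ones, drop vanished ones.

    index and source both map path -> version (e.g. a modification stamp).
    Cost is proportional to the total number of entries, not to the
    number of changes.
    """
    visited = 0
    reindexed = 0
    for path, version in source.items():
        visited += 1
        if index.get(path) != version:
            index[path] = version
            reindexed += 1
    # Deletions are only detectable after the whole source has been seen.
    deleted = [path for path in index if path not in source]
    for path in deleted:
        del index[path]
    return {"visited": visited, "reindexed": reindexed, "deleted": len(deleted)}


def minimal_crawl(index, source, candidates):
    """Check only the given candidate paths (hypothetical) and never delete.

    Because the full listing is never compared against the index, entries
    removed from the source stay in the index until the next full crawl.
    """
    visited = 0
    reindexed = 0
    for path in candidates:
        visited += 1
        version = source.get(path)
        if version is not None and index.get(path) != version:
            index[path] = version
            reindexed += 1
    return {"visited": visited, "reindexed": reindexed, "deleted": 0}


if __name__ == "__main__":
    # Index state after last night's crawl.
    index = {"a.doc": 1, "b.doc": 1, "c.doc": 1}
    # Today: b.doc was modified and c.doc was deleted.
    source = {"a.doc": 1, "b.doc": 2}

    print(minimal_crawl(index, source, ["b.doc"]))
    print("c.doc still indexed:", "c.doc" in index)
    print(full_crawl(index, source))
    print("c.doc still indexed:", "c.doc" in index)
```

In this toy model the minimal pass touches one entry and leaves the stale
c.doc in the index, while the full pass touches every entry and removes it.
That is why a schedule would typically mix frequent minimal crawls with an
occasional full one.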

Thanks,
Karl

Sent from my Windows Phone
------------------------------
From: Ronny Heylen
Sent: 11/7/2013 4:34 PM
To: user@manifoldcf.apache.org
Subject: Manifoldcf is "slow"

Hi,
We have a job that indexes all *.doc* files from a shared Windows network
drive.
That makes 245113 documents.
The job ran last night.
It ran again tonight and finished successfully in 2 hours and 20
minutes.
But of these 245113 documents, only 30 were modified today.
How is it possible that 140 minutes were necessary to reindex them?
Is it because ManifoldCF rechecks the permissions for all documents against
the AD?
Or something else?
Can we speed things up by using "Start minimal"? (What exactly "Start
minimal" does is still a bit of a mystery to us.) In that case, should we
use "Start" once a week to stay up to date?
Thanks for the help,
Ronny&Frédéric
