manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wright <daddy...@gmail.com>
Subject Re: 2 seconds delay when checking documents
Date Tue, 19 Jul 2016 10:51:44 GMT
Hi Konstantin,

The stuffer thread operates independently of the worker threads.  It wakes
up, and if anything is available, stuffs as much as it can, and when done
sleeps for a short time.  The queue is maintained at at least twice the
number of active worker threads.  See ManifoldCF in Action.

Karl


On Tue, Jul 19, 2016 at 6:33 AM, jetnet <jetnet@gmail.com> wrote:

> hi All,
>
> I've encountered recently an issue with the crawler (JCIFS connector):
> when a jobs gets started, all it's documents are being checked, and this
> process is taking too long. After tuning DEBUG on, I found, that there are
> ~2 seconds delay when processing the document queue:
>
> e.g.:
>
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - JCIFS: Leaving
> wouldFileBeIncluded for 'smb://...
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - Worker thread done
> processing 1 documents
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') -  Adding 1453999232278
> to finishList
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') -  Adding 1453999232278
> to ingesterCheckList
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - Finishing documents
> {1453999232278 }
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') -  Requeueing documents
> due to carrydown {}
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') -  Requeuing
> {1453999232278 }
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - Deleting {}
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - Hopcount removal {}
> DEBUG 2016-07-19 10:31:06,484 (Worker thread '61') - Rescanning documents
> {}
> DEBUG 2016-07-19 10:31:06,500 (Stuffer thread) - Stuffer thread: Found 0
> documents to queue
> DEBUG 2016-07-19 10:31:06,750 (Document cleanup stuffer thread) - Document
> cleanup stuffer thread woke up
> DEBUG 2016-07-19 10:31:06,750 (Document delete stuffer thread) - Document
> delete stuffer thread woke up
> DEBUG 2016-07-19 10:31:06,750 (Document cleanup stuffer thread) - Document
> cleanup stuffer thread found nothing to do
> DEBUG 2016-07-19 10:31:06,750 (Document delete stuffer thread) - Document
> delete stuffer thread found nothing to do
> DEBUG 2016-07-19 10:31:07,375 (Set priority thread) - Done reprioritizing
> because no more documents to reprioritize
> DEBUG 2016-07-19 10:31:07,750 (Document cleanup stuffer thread) - Document
> cleanup stuffer thread woke up
> DEBUG 2016-07-19 10:31:07,750 (Document delete stuffer thread) - Document
> delete stuffer thread woke up
> DEBUG 2016-07-19 10:31:07,750 (Document delete stuffer thread) - Document
> delete stuffer thread found nothing to do
> DEBUG 2016-07-19 10:31:07,750 (Document cleanup stuffer thread) - Document
> cleanup stuffer thread found nothing to do
> DEBUG 2016-07-19 10:31:08,500 (Stuffer thread) - Document stuffer thread
> woke up
> DEBUG 2016-07-19 10:31:08,516 (Stuffer thread) - Stuffer thread: Found 2
> documents to queue
> DEBUG 2016-07-19 10:31:08,531 (Stuffer thread) - Document stuffer thread
> woke up
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') - Worker thread
> processing documents: 1453999191642
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '57') - Worker thread
> processing documents: 1453999188326
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') - Worker thread
> starting document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '57') - Worker thread
> starting document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') - Post-relationship
> document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '57') - Post-relationship
> document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') -  Post-hopcount pruned
> document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '57') -  Post-hopcount pruned
> document count is 1
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') - Worker thread about
> to process {1453999191642 }
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '57') - Worker thread about
> to process {1453999188326 }
> DEBUG 2016-07-19 10:31:08,531 (Worker thread '62') - JCIFS: Processing
> 'smb://...
>
>
> As one can see, there is a delay between the thread-61 has left the checks
> and next 2 threads have started.
> Any idea why that happens? Why the document stuffer thread does not wake
> up immediately? Or - why the "document queue batch size" is only 2?
>
> P.S. MCF version 2.3
>
> Thanks!
>
> --
> Konstantin
>

Mime
View raw message