manifoldcf-user mailing list archives

From Ronny Heylen <securaqbere...@gmail.com>
Subject Error: Repeated service interruptions - failure processing document: Read timed out
Date Wed, 06 Nov 2013 20:17:29 GMT
Hi,
We use ManifoldCF 1.3 and Solr 4.4 to index a shared network drive with
several hundred thousand documents.
Running a single ManifoldCF job to index the whole drive always gave some
kind of error. To better understand where the problem might be, we
made one job to index all *.doc*, another one for *.xls*, another one for
*.pdf ...
Using the help from the list (thanks!) we set the size limit to 100MB and
all jobs succeed (great) except the one for *.pptx.
The message is:
Error: Repeated service interruptions - failure processing document: Read
timed out
We have not found any error in the logs we have searched: solr.log, ...
Based on some indications found on the Internet, we have set the Throttling
max connections setting to 2 (instead of 10) in 3 places:
- output connection to Solr
- authority connection to the Active Directory
- repository connection to the Windows file share
But the problem stays the same.
We have tried on another machine with Solr 4.5 and ManifoldCF 1.4, same
problem.
We can run the job for all *.PDF, or all *.DOC*, or all *.XLS* without
problem, but the same message always comes for *.PPTX.
The last time the job stopped with this message, it displayed (the numbers
differ for each run because the Windows drive is changing) 56311 documents,
with 17466 busy and 38847 processed.
As we cannot find anything in the logs (but we are probably not looking in
the right place), we don't know what to do.
Thanks for your help,
Ronny and Frédéric
