manifoldcf-user mailing list archives

From Luis Cabaceira <cabace...@gmail.com>
Subject Re: Job error during WindowsShare repository connector indexation
Date Wed, 11 Oct 2017 10:05:10 GMT
From the look of it, this may be caused by a limit on the number of open
file handles. Your process may be creating too many file handles and not
closing them in time, eventually preventing further file operations.

I suggest you check this. On Linux, run the following to see the system-wide
maximum: cat /proc/sys/fs/file-max


To see the hard and soft limits for the current shell:

# ulimit -Hn
# ulimit -Sn

P.S. - Switch to the user that runs ManifoldCF first
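
To compare what the running process actually has open against its limit, a
rough sketch follows. It assumes Linux with /proc mounted. The function names
are illustrative only and are not part of ManifoldCF.

```shell
#!/usr/bin/env bash
# Sketch: compare a process's open file descriptors with its soft limit.
# Assumes Linux with /proc mounted; function names are hypothetical helpers.

# Print the number of file descriptors currently open by PID $1.
count_open_fds() {
  ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit for PID $1, read from /proc/PID/limits.
soft_nofile_limit() {
  awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Print both values on one line for PID $1.
report_fd_usage() {
  echo "pid=$1 open=$(count_open_fds "$1") soft_limit=$(soft_nofile_limit "$1")"
}
```

Call report_fd_usage with the PID of the ManifoldCF agents process (find it
with ps or pgrep; the exact process name depends on how it was started). Run
it as the same user or as root, since /proc/PID/fd is not readable otherwise.
If the open count climbs steadily toward the soft limit while the job runs,
that would support the file handle explanation.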


On 11 October 2017 at 13:54, Olivier Tavard <olivier.tavard@francelabs.com>
wrote:

> Hi,
>
> Thanks for your answer.
> Yes, I could reach the Samba server from the MCF server. Indeed, during the
> first hours after the MCF job was launched, thousands of documents were
> correctly accessed and processed by MCF. The errors mentioned appeared only
> after a few hours. Before that, indexing worked correctly.
>
> Best regards,
> Olivier TAVARD
>
>
> On 11 Oct 2017 at 11:21, Cihad Guzel <cguzelg@gmail.com> wrote:
>
> Hi Olivier,
>
> Did you try to connect to the Samba server with any Samba client app? Check
> iptables on your server. Can you stop iptables on the Ubuntu server? Or
> maybe you can reconfigure iptables.
>
> Regards,
> Cihad Guzel
>
>
> 2017-10-11 12:02 GMT+03:00 Olivier Tavard <olivier.tavard@francelabs.com>:
>
>> Hi,
>>
>> I got this error while crawling a Samba share hosted on Ubuntu Server:
>> ERROR 2017-10-05 00:00:14,109 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.crawlerthreads|Exception tossed: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.updateServiceData(BaseLockManager.java:273)
>> at org.apache.manifoldcf.core.lockmanager.LockManager.updateServiceData(LockManager.java:108)
>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:654)
>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
>> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
>> at org.apache.manifoldcf.crawler.system.IdleCleanupThread.run(IdleCleanupThread.java:68)
>>
>> I used MCF 2.8.1 on Debian 8 with PostgreSQL 9.5.3 and the Windows Share
>> repository connector. The job was configured to process about 2 million
>> files (600 GB).
>> For text extraction I used a Tika server (on the same server as MCF) and
>> added the Tika external content extractor transformation connector to the
>> job configuration.
>> The error appeared 9 hours after the job was launched. The job status
>> still indicated that the job was running, but there was only 1 document in
>> the active column and the error above was repeated in the MCF log.
>>
>> Then I tried to run the clean-lock.sh script and got this error:
>> WARN 2017-10-09 08:23:56,284 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.lock|Attempt to set file lock 'mcf/mcf_home/./syncharea/551/442/lock-_POOLTARGET__REPOSITORYCONNECTORPOOL_SmbFileShare.lock' failed: No such file or directory
>> java.io.IOException: No such file or directory
>> at java.io.UnixFileSystem.createFileExclusively(Native Method)
>> at java.io.File.createNewFile(File.java:1012)
>> at org.apache.manifoldcf.core.lockmanager.FileLockObject.grabFileLock(FileLockObject.java:223)
>> at org.apache.manifoldcf.core.lockmanager.FileLockObject.obtainGlobalWriteLockNoWait(FileLockObject.java:78)
>> at org.apache.manifoldcf.core.lockmanager.LockObject.obtainGlobalWriteLock(LockObject.java:121)
>> at org.apache.manifoldcf.core.lockmanager.LockObject.enterWriteLock(LockObject.java:74)
>> at org.apache.manifoldcf.core.lockmanager.LockGate.enterWriteLock(LockGate.java:177)
>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWrite(BaseLockManager.java:1120)
>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWriteLock(BaseLockManager.java:757)
>> at org.apache.manifoldcf.core.lockmanager.LockManager.enterWriteLock(LockManager.java:302)
>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:585)
>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
>> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
>> at org.apache.manifoldcf.crawlerui.IdleCleanupThread.run(IdleCleanupThread.java:69)
>> And the error was repeated indefinitely in the log.
>>
>> Does this mean that there was a problem with the syncharea folder at some
>> point?
>>
>> Thanks,
>> Best regards,
>>
>> Olivier TAVARD
>>
>
>
>
> --
> Cihad Güzel
>
>
>


-- 
Luis Cabaceira
