manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Job error during WindowsShare repository connector indexation
Date Wed, 11 Oct 2017 11:12:23 GMT
This error:

>>>>>>
WARN 2017-10-09 08:23:56,284 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.lock|Attempt to set file lock 'mcf/mcf_home/./syncharea/551/442/lock-_POOLTARGET__REPOSITORYCONNECTORPOOL_SmbFileShare.lock' failed: No such file or directory
java.io.IOException: No such file or directory
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:1012)
at org.apache.manifoldcf.core.lockmanager.FileLockObject.grabFileLock(FileLockObject.java:223)
at org.apache.manifoldcf.core.lockmanager.FileLockObject.obtainGlobalWriteLockNoWait(FileLockObject.java:78)
at org.apache.manifoldcf.core.lockmanager.LockObject.obtainGlobalWriteLock(LockObject.java:121)
at org.apache.manifoldcf.core.lockmanager.LockObject.enterWriteLock(LockObject.java:74)
at org.apache.manifoldcf.core.lockmanager.LockGate.enterWriteLock(LockGate.java:177)
at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWrite(BaseLockManager.java:1120)
at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWriteLock(BaseLockManager.java:757)
at org.apache.manifoldcf.core.lockmanager.LockManager.enterWriteLock(LockManager.java:302)
at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:585)
at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
at org.apache.manifoldcf.crawlerui.IdleCleanupThread.run(IdleCleanupThread.java:69)
And the error was repeated indefinitely in the log.
<<<<<<

is caused by somebody erasing the file-based syncharea directory while
ManifoldCF processes were still running.  In any case, we strongly suggest
using ZooKeeper rather than file-based synchronization.
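
The stack trace above ends in java.io.File.createNewFile, which throws an
IOException when the parent directory of the target file does not exist.
The following is only a minimal sketch of that failure mode, not ManifoldCF's
own code: the class name, the helper method, the temporary directory and the
lock file name are all made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class MissingLockDirDemo {
    /** Tries to create the lock file; returns null on success, else the IOException message. */
    static String tryCreateLock(File lockFile) {
        try {
            lockFile.createNewFile();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the syncharea directory tree.
        Path syncArea = Files.createTempDirectory("syncharea-demo");
        File lockDir = new File(syncArea.toFile(), "551/442");
        lockDir.mkdirs();
        File lockFile = new File(lockDir, "lock-demo.lock");

        // While the directory exists, creating the lock file succeeds.
        System.out.println("directory present: " + tryCreateLock(lockFile));

        // Simulate somebody erasing the directory while the process is running.
        lockFile.delete();
        lockDir.delete();

        // The same call now fails because the parent directory is gone.
        System.out.println("directory erased:  " + tryCreateLock(lockFile));
    }
}
```

The second call reports an error message instead of null; the exact text is
platform-dependent, and on Linux it is typically the same "No such file or
directory" seen in the log.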

Thanks,

Karl


On Wed, Oct 11, 2017 at 6:05 AM, Luis Cabaceira <cabaceira@gmail.com> wrote:

> From the look of it, this could be caused by a limit on the number of open
> file handles. Your process may be creating too many file handles and not
> closing them in time, which eventually prevents further file operations.
>
> I suggest you check this. On Linux, run: cat /proc/sys/fs/file-max
>
>
> To see the hard and soft limits:
>
> # ulimit -Hn
> # ulimit -Sn
>
> P.S. Switch to the user that runs ManifoldCF first.
>
>
> On 11 October 2017 at 13:54, Olivier Tavard <olivier.tavard@francelabs.com> wrote:
>
>> Hi,
>>
>> Thanks for your answer.
>> Yes, I could reach the Samba server from the MCF server. In fact, during the
>> first hours after the MCF job was launched, thousands of documents were
>> correctly accessed and processed by MCF. The errors mentioned appeared only
>> after a few hours. Before that, indexing proceeded correctly.
>>
>> Best regards,
>> Olivier TAVARD
>>
>>
>> On 11 Oct 2017, at 11:21, Cihad Guzel <cguzelg@gmail.com> wrote:
>>
>> Hi Olivier,
>>
>> Did you try connecting to the Samba server with a Samba client application?
>> Check iptables on your server. Can you stop iptables on the Ubuntu server?
>> Alternatively, you could reconfigure iptables.
>>
>> Regards,
>> Cihad Guzel
>>
>>
>> 2017-10-11 12:02 GMT+03:00 Olivier Tavard <olivier.tavard@francelabs.com>:
>>
>>> Hi,
>>>
>>> I got this error while crawling a Samba share hosted on Ubuntu Server:
>>> ERROR 2017-10-05 00:00:14,109 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.crawlerthreads|Exception tossed: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
>>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.updateServiceData(BaseLockManager.java:273)
>>> at org.apache.manifoldcf.core.lockmanager.LockManager.updateServiceData(LockManager.java:108)
>>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:654)
>>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
>>> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
>>> at org.apache.manifoldcf.crawler.system.IdleCleanupThread.run(IdleCleanupThread.java:68)
>>>
>>> I used MCF 2.8.1 on Debian 8 with PostgreSQL 9.5.3 and the Windows Share
>>> repository connector. The job was configured to process about 2 million
>>> files (600 GB).
>>> For text extraction I used a Tika server (on the same server as MCF) and
>>> added the Tika external content extractor transformation connector to the
>>> job configuration.
>>> The error appeared 9 hours after the job was launched. The job status
>>> still indicated that the job was running, but there was only 1 document in
>>> the active column and the error above was repeated in the MCF log.
>>>
>>> Then I tried to run the clean-lock.sh script and got this error:
>>> WARN 2017-10-09 08:23:56,284 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.lock|Attempt to set file lock 'mcf/mcf_home/./syncharea/551/442/lock-_POOLTARGET__REPOSITORYCONNECTORPOOL_SmbFileShare.lock' failed: No such file or directory
>>> java.io.IOException: No such file or directory
>>> at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>> at java.io.File.createNewFile(File.java:1012)
>>> at org.apache.manifoldcf.core.lockmanager.FileLockObject.grabFileLock(FileLockObject.java:223)
>>> at org.apache.manifoldcf.core.lockmanager.FileLockObject.obtainGlobalWriteLockNoWait(FileLockObject.java:78)
>>> at org.apache.manifoldcf.core.lockmanager.LockObject.obtainGlobalWriteLock(LockObject.java:121)
>>> at org.apache.manifoldcf.core.lockmanager.LockObject.enterWriteLock(LockObject.java:74)
>>> at org.apache.manifoldcf.core.lockmanager.LockGate.enterWriteLock(LockGate.java:177)
>>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWrite(BaseLockManager.java:1120)
>>> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWriteLock(BaseLockManager.java:757)
>>> at org.apache.manifoldcf.core.lockmanager.LockManager.enterWriteLock(LockManager.java:302)
>>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:585)
>>> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
>>> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
>>> at org.apache.manifoldcf.crawlerui.IdleCleanupThread.run(IdleCleanupThread.java:69)
>>> And the error was repeated indefinitely in the log.
>>>
>>> Does this mean that there was a problem with the syncharea folder at some
>>> point?
>>>
>>> Thanks,
>>> Best regards,
>>>
>>> Olivier TAVARD
>>>
>>
>>
>>
>> --
>> Cihad Güzel
>>
>>
>>
>
>
> --
> Luis Cabaceira
>
