manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Repeated service interruptions
Date Wed, 01 Aug 2012 10:35:36 GMT
On Wed, Aug 1, 2012 at 5:48 AM, Shinichiro Abe
<shinichiro.abe.1@gmail.com> wrote:
> Hi Karl,
>
> I still have a problem.
> I reduced the maximum number of connections to 2.
> I rebooted the file server, not the domain controller.
> When I configured the paths as in [1], the log showed no errors
> and the ShareDrive connector crawled the files successfully.
> When I reset the path configuration to the default (matching *),
> the log showed the "all pipe instances are busy" error.
> Both path configurations pointed to the same location.
>
> Also, when this error occurred, the ingestion log showed that
> HttpPoster was waiting for the response stream,
> could not get a response from Solr,
> and threw a SocketTimeoutException.
> I increased jcifs.smb.client.responseTimeout,
> but the exception was still thrown.
> On the Solr side, Jetty threw a SocketException (socket write error).
> I'm checking the Solr logs now.
> Solr may be doing something wrong when running /update/extract.
>

If Solr threw the exception, this sounds likely.

> Have you seen anything like this?
> Does the path matching configuration affect those errors?
>
> [1]Paths Tab:
> Include  directory(s)  matching  /01*
>

This should have nothing to do with socket exceptions, except possibly
that the crawler winds up trying to read something that isn't actually a
file, such as a named pipe.  This typically doesn't happen if the server
is a Windows machine, but if it is a Samba server I could imagine
something like that happening.
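
To illustrate the kind of guard I mean, here is a minimal local sketch, not
the connector's actual code.  The Windows shares connector talks to the
server through jcifs, whereas this uses java.nio against a local filesystem
purely as an analogy; the class name CrawlCandidateCheck and its methods are
hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of ManifoldCF or jcifs.
class CrawlCandidateCheck {

    // True only for ordinary files. Directories, named pipes (FIFOs),
    // sockets, devices and missing entries all return false.
    static boolean isCrawlable(Path candidate) {
        return Files.isRegularFile(candidate);
    }

    // Keeps only the candidates that can be opened and read as documents.
    static List<Path> crawlableOnly(List<Path> candidates) {
        List<Path> result = new ArrayList<>();
        for (Path p : candidates) {
            if (isCrawlable(p)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crawl-demo");
        Path file = Files.createTempFile(dir, "doc", ".pdf");
        Path missing = dir.resolve("no-such-entry");

        List<Path> kept = crawlableOnly(Arrays.asList(dir, file, missing));
        System.out.println(kept.size());              // 1
        System.out.println(kept.get(0).equals(file)); // true

        // Clean up the temporary entries created for the demo.
        Files.delete(file);
        Files.delete(dir);
    }
}
```

On a Unix-like machine a FIFO created with mkfifo would also be filtered out
by this check, since Files.isRegularFile only accepts ordinary files.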

Karl

> P.S.
> Thank you for fixing CONNECTORS-494.
> I checked the trunk code and it worked well.
>
> Thank you,
> Shinichiro Abe
>
> On 2012/07/24, at 22:13, Karl Wright wrote:
>
>> Hi Abe-san,
>>
>> Did you figure out what the problem was?
>>
>> Karl
>>
>> On Thu, Jul 19, 2012 at 5:52 AM, Karl Wright <daddywri@gmail.com> wrote:
>>> Hi Abe-san,
>>>
>>> Sometimes what looks like a server error can actually be due to the
>>> domain controller.  I wonder if the domain controller needs to be
>>> rebooted?
>>>
>>> Karl
>>>
>>> On Thu, Jul 19, 2012 at 5:12 AM, Shinichiro Abe
>>> <shinichiro.abe.1@gmail.com> wrote:
>>>> Hi Karl,
>>>> Thank you for the reply.
>>>> I tried reducing the maximum number of connections from 10
>>>> to 5, but that didn't avoid the busy error. I'll try reducing it further.
>>>> Thank you.
>>>> Shinichiro Abe
>>>>
>>>> On 2012/07/19, at 15:55, Karl Wright wrote:
>>>>
>>>>> Hi Abe-san,
>>>>>
>>>>> The "all pipe instances are busy" error is coming from the Windows
>>>>> server you are trying to crawl.  I don't know what is happening there
>>>>> but here are some possibilities:
>>>>>
>>>>> (1) The Windows server is just overloaded; you can try reducing the
>>>>> maximum number of connections to 2 or 3 to see if that helps.
>>>>> (2) The Windows server needs rebooting.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>> On Wed, Jul 18, 2012 at 10:09 PM, Shinichiro Abe
>>>>> <shinichiro.abe.1@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am using the Windows shares connector and ran a job.
>>>>>> The job was aborted before completing normally, and the job's status said:
>>>>>> Error: Repeated service interruptions - failure processing document: Read timed out
>>>>>>
>>>>>> Why was the job aborted? I use ManifoldCF 0.5.1 and the latest version of jcifs.jar.
>>>>>> Is the crawled server busy? The server where MCF is installed does not seem to be busy,
>>>>>> but the other servers that MCF crawls seem to be busy.
>>>>>> How can I run the job without error? What's wrong?
>>>>>>
>>>>>>
>>>>>> the logs of connector:
>>>>>>
>>>>>> WARN 2012-07-12 16:28:52,648 (Worker thread '19') - JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy.
>>>>>>       at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
>>>>>>       at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: Possibly transient exception detected on attempt 3 while getting share security: All pipe instances are busy.
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: 'Busy' response when getting document version for smb://XX.XX.XX.XX/D$/abcde/1234/123456789/e123456789a.pdf: retrying...
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - Pre-ingest service interruption reported for job 1342076182624 connection 'Windows shares': Timeout or other service interruption: All pipe instances are busy.
>>>>>> ..
>>>>>> WARN 2012-07-12 19:14:30,335 (Worker thread '19') - Service interruption reported for job 1342076182624 connection 'Windows shares': Ingestion API socket timeout exception waiting for response code: Read timed out; ingestion will be retried again later
>>>>>> ..
>>>>>> WARN 2012-07-12 20:43:50,210 (Worker thread '19') - Service interruption reported for job 1342076182624 connection 'Windows shares': Ingestion API socket timeout exception waiting for response code: Read timed out; ingestion will be retried again later
>>>>>> ..
>>>>>> ERROR 2012-07-12 20:43:50,210 (Worker thread '19') - Exception tossed: Repeated service interruptions - failure processing document: Read timed out
>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Read timed out
>>>>>>       at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:606)
>>>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>>>>       at java.net.SocketInputStream.socketRead0(Native Method)
>>>>>>       at java.net.SocketInputStream.read(Unknown Source)
>>>>>>       at java.net.SocketInputStream.read(Unknown Source)
>>>>>>       at org.apache.manifoldcf.agents.output.solr.HttpPoster.readLine(HttpPoster.java:571)
>>>>>>       at org.apache.manifoldcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:598)
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Shinichiro Abe
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>
