manifoldcf-user mailing list archives

From Shigeki Kobayashi <shigeki.kobayas...@g.softbank.co.jp>
Subject Re: Repeated service interruptions
Date Wed, 05 Sep 2012 05:37:33 GMT
Hi Abe-san

I've just run into the same problem as you did, and I'm having trouble
figuring out how to solve it.

Did you figure out how to get rid of this problem? If so, it would be nice
if you could share how you did it.


Regards,

Shigeki

2012/8/2 Shinichiro Abe <shinichiro.abe.1@gmail.com>

> Thanks very much for the help!
> I understand.
> Shinichiro Abe
>
> On 2012/08/01, at 19:35, Karl Wright wrote:
>
> > On Wed, Aug 1, 2012 at 5:48 AM, Shinichiro Abe
> > <shinichiro.abe.1@gmail.com> wrote:
> >> Hi Karl,
> >>
> >> I still have a problem.
> >> I reduced the maximum number of connections to 2.
> >> I rebooted the file server, not the domain controller.
> >> When I configured the paths[1], the log showed no errors
> >> and the ShareDrive connector crawled the files successfully.
> >> When I set the path config back to the default (matching *),
> >> the log showed the "all pipe instances are busy" error.
> >> Both path configs pointed to the same location.
> >>
> >> Also, when this error occurred, the ingest log showed that
> >> HttpPoster was waiting for the response stream,
> >> couldn't get a response from Solr,
> >> and threw SocketTimeoutException.
> >> I increased jcifs.smb.client.responseTimeout
> >> but it still threw the exception.
> >> On Solr, Jetty threw SocketException(socket write error).
> >> I'm working on checking Solr logs.
> >> Solr may do something wrong when running /update/extract.
> >>
> >
> > If Solr threw the exception this sounds likely.
> >
> >> Do you know something like this?
> >> Does path's matching config affect those errors?
> >>
> >> [1]Paths Tab:
> >> Include  directory(s)  matching  /01*
> >>
> >
> > This should have nothing to do with socket exceptions, except possibly
> > that the crawler winds up trying to read a file that isn't actually a
> > file but is something else, like a named pipe or something.  This
> > typically doesn't happen if the server is a Windows machine but if it
> > is a Samba server I could imagine something like that happening.
> >
> > Karl
> >
> >> P.S.
> >> Thank you for fixing CONNECTORS-494.
> >> I checked the trunk code, and it worked well.
> >>
> >> Thank you,
> >> Shinichiro Abe
> >>
> >> On 2012/07/24, at 22:13, Karl Wright wrote:
> >>
> >>> Hi Abe-san,
> >>>
> >>> Did you figure out what the problem was?
> >>>
> >>> Karl
> >>>
> >>> On Thu, Jul 19, 2012 at 5:52 AM, Karl Wright <daddywri@gmail.com> wrote:
> >>>> Hi Abe-san,
> >>>>
> >>>> Sometimes what looks like a server error can actually be due to the
> >>>> domain controller.  I wonder if the domain controller needs to be
> >>>> rebooted?
> >>>>
> >>>> Karl
> >>>>
> >>>> On Thu, Jul 19, 2012 at 5:12 AM, Shinichiro Abe
> >>>> <shinichiro.abe.1@gmail.com> wrote:
> >>>>> Hi Karl,
> >>>>> Thank you for the reply.
> >>>>> I tried reducing the maximum number of connections from 10
> >>>>> to 5, but that didn't avoid the busy error. I'll try reducing it further.
> >>>>> Thank you.
> >>>>> Shinichiro Abe
> >>>>>
> >>>>> On 2012/07/19, at 15:55, Karl Wright wrote:
> >>>>>
> >>>>>> Hi Abe-san,
> >>>>>>
> >>>>>> The "all pipe instances are busy" error is coming from the Windows
> >>>>>> server you are trying to crawl.  I don't know what is happening there
> >>>>>> but here are some possibilities:
> >>>>>>
> >>>>>> (1) The Windows server is just overloaded; you can try reducing the
> >>>>>> maximum number of connections to 2 or 3 to see if that helps.
> >>>>>> (2) The Windows server needs rebooting.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Karl
> >>>>>>
> >>>>>> On Wed, Jul 18, 2012 at 10:09 PM, Shinichiro Abe
> >>>>>> <shinichiro.abe.1@gmail.com> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I use the Windows shares connector and ran a job.
> >>>>>>> The job was aborted instead of finishing normally, and the job's status said:
> >>>>>>> Error: Repeated service interruptions - failure processing document: Read timed out
> >>>>>>>
> >>>>>>> Why was the job aborted? I use ManifoldCF 0.5.1 and the latest version of jcifs.jar.
> >>>>>>> Is the crawled server busy? The server where MCF is installed does not seem to be busy,
> >>>>>>> but the other servers that MCF crawls seem to be busy.
> >>>>>>> How can I run the job without error? What's wrong?
> >>>>>>>
> >>>>>>>
> >>>>>>> the logs of connector:
> >>>>>>>
> >>>>>>> WARN 2012-07-12 16:28:52,648 (Worker thread '19') - JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy.
> >>>>>>>      at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
> >>>>>>>      at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
> >>>>>>> ..
> >>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: Possibly transient exception detected on attempt 3 while getting share security: All pipe instances are busy.
> >>>>>>> ..
> >>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: 'Busy' response when getting document version for smb://XX.XX.XX.XX/D$/abcde/1234/123456789/e123456789a.pdf: retrying...
> >>>>>>> ..
> >>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - Pre-ingest service interruption reported for job 1342076182624 connection 'Windows shares': Timeout or other service interruption: All pipe instances are busy.
> >>>>>>> ..
> >>>>>>> WARN 2012-07-12 19:14:30,335 (Worker thread '19') - Service interruption reported for job 1342076182624 connection 'Windows shares': Ingestion API socket timeout exception waiting for response code: Read timed out; ingestion will be retried again later
> >>>>>>> ..
> >>>>>>> WARN 2012-07-12 20:43:50,210 (Worker thread '19') - Service interruption reported for job 1342076182624 connection 'Windows shares': Ingestion API socket timeout exception waiting for response code: Read timed out; ingestion will be retried again later
> >>>>>>> ..
> >>>>>>> ERROR 2012-07-12 20:43:50,210 (Worker thread '19') - Exception tossed: Repeated service interruptions - failure processing document: Read timed out
> >>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Read timed out
> >>>>>>>      at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:606)
> >>>>>>> Caused by: java.net.SocketTimeoutException: Read timed out
> >>>>>>>      at java.net.SocketInputStream.socketRead0(Native Method)
> >>>>>>>      at java.net.SocketInputStream.read(Unknown Source)
> >>>>>>>      at java.net.SocketInputStream.read(Unknown Source)
> >>>>>>>      at org.apache.manifoldcf.agents.output.solr.HttpPoster.readLine(HttpPoster.java:571)
> >>>>>>>      at org.apache.manifoldcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:598)
> >>>>>>>
> >>>>>>> Thanks in advance,
> >>>>>>> Shinichiro Abe
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>
>
>
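
The thread above mixes two different failures: "All pipe instances are busy", which Karl attributes to the crawled Windows/SMB server, and "Read timed out", which the stack trace places in HttpPoster while waiting on Solr. As a rough aid for telling them apart in a ManifoldCF log, here is a minimal sketch that tallies log lines by likely origin. The class name LogTriage, the Source labels, and the substring rules are assumptions made for illustration; they are not part of ManifoldCF or JCIFS, and other messages may well belong in either bucket.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class LogTriage {
    /** Hypothetical labels for where a failure probably originates. */
    public enum Source { SMB_SERVER, SOLR_OUTPUT, OTHER }

    /** Classify one log line using substrings taken from the messages quoted in this thread. */
    public static Source classify(String line) {
        if (line.contains("All pipe instances are busy")) {
            return Source.SMB_SERVER;   // reported by the crawled file server
        }
        if (line.contains("Ingestion API socket timeout")
                || line.contains("Read timed out")) {
            return Source.SOLR_OUTPUT;  // seen while waiting on the Solr response
        }
        return Source.OTHER;
    }

    /** Count how many lines fall into each category. */
    public static Map<Source, Integer> tally(List<String> lines) {
        Map<Source, Integer> counts = new EnumMap<>(Source.class);
        for (Source s : Source.values()) {
            counts.put(s, 0);
        }
        for (String line : lines) {
            counts.merge(classify(line), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines shortened from the log excerpt in the thread.
        List<String> sample = Arrays.asList(
            "WARN - JCIFS: Possibly transient exception detected on attempt 1"
                + " while getting share security: All pipe instances are busy.",
            "WARN - Service interruption reported: Ingestion API socket timeout"
                + " exception waiting for response code: Read timed out",
            "INFO - some unrelated line");
        System.out.println(tally(sample));
    }
}
```

If most hits land in SMB_SERVER, the suggestions about connection limits and the file server are the place to look; if they land in SOLR_OUTPUT, the Solr and Jetty logs are more likely to explain the abort.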
