manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Error with webcrawler
Date Tue, 04 Jun 2013 17:02:29 GMT
I'm pretty sure this is related to changes that were made for
CONNECTORS-693.  If I can't get any further shortly, I will disable those
changes until I can figure out what is wrong.

Karl
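
For context on the message quoted below: Java's InputStream.read() is specified to return either -1 (end of stream) or an unsigned byte value in 0..255. A value such as -117 is what a single-byte read() produces when it returns a signed byte without masking it, since the byte 0x8B is 139 unsigned but -117 as a signed Java byte. The following is only a minimal sketch of that failure mode. It is not the ManifoldCF code, and ReadContractSketch, SingleByteReader, and readFirst are hypothetical names invented for this illustration. Whether this is what actually went wrong in ThrottledInputstream is not established anywhere in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration; none of these names come from ManifoldCF.
class ReadContractSketch {

    // Wraps a stream and implements single-byte read() on top of a bulk read.
    static class SingleByteReader extends InputStream {
        private final InputStream in;
        private final boolean mask;

        SingleByteReader(InputStream in, boolean mask) {
            this.in = in;
            this.mask = mask;
        }

        @Override
        public int read() throws IOException {
            byte[] buf = new byte[1];
            int n = in.read(buf, 0, 1);
            if (n == -1) {
                return -1; // end of stream
            }
            // Java bytes are signed, so buf[0] widens to -128..127 unless masked.
            return mask ? (buf[0] & 0xff) : buf[0];
        }
    }

    // Reads the first byte of data through the wrapper, with or without masking.
    static int readFirst(byte[] data, boolean mask) {
        try (InputStream s = new SingleByteReader(new ByteArrayInputStream(data), mask)) {
            return s.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0x8B };
        System.out.println(readFirst(data, false)); // prints -117 (contract violated)
        System.out.println(readFirst(data, true));  // prints 139 (within 0..255)
    }
}
```

With the mask applied, the same byte comes back as 139, which is inside the -1..255 range that the error message checks for.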



On Tue, Jun 4, 2013 at 1:00 PM, Stephane Gamard <stephane@gamard.net> wrote:

> Hi Karl,
>
> I've looked into the simpleHistory, and unfortunately the message is the
> same as in the log:
>
>  06-04-2013 18:55:12.876 fetch http://wiki.apache.org/solr/
> -1042 765 Interrupted: IO exception reading response stream:
> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
> returned value out of range -1..255: -117
>
>
> On Tue, Jun 4, 2013 at 6:54 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Stephane,
>>
>> I'll look into the problem, but it would be great if you could have a
>> look at the Simple History and tell me if you see a stack trace there.
>> I've not seen this issue before and having a line number would be really
>> helpful.
>>
>> Karl
>>
>>
>> On Tue, Jun 4, 2013 at 12:45 PM, Stephane Gamard <stephane@gamard.net> wrote:
>>
>>> Hi All,
>>>
>>>
>>> Just checked out the trunk to test CONNECTORS-700 and glad it works
>>> (thanks a whole bunch!). Just wondering about a new bug I have. The
>>> previously running web crawler is now broken. I've dropped it and created a
>>> new one and I have the following error:
>>>
>>>
>>>  WARN 2013-06-04 18:40:08,393 (Worker thread '1') - Pre-ingest service
>>> interruption reported for job 1370363902673 connection
>>> 'default-web-repository': IO exception reading response stream:
>>> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
>>> returned value out of range -1..255: -117
>>>
>>>
>>> Then the job stays at that status:
>>>  **Restart**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>   **Restart minimal**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>   **Pause**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>   **Abort**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>   wiki-documentation Running Tue Jun 04 18:40:04 CEST 2013 1 11
>>>
>>>
>>> Any idea about why?
>>>
>>> Attached is the full log
>>>
>>> _Stephane
>>>
>>
>>
>
