manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Error with webcrawler
Date Tue, 04 Jun 2013 17:07:09 GMT
Hi Stephane,

I just committed a change that may well fix this: r1489521.  Please synch
up and let me know.  If it doesn't, I will be happy to disable the feature
until I have a fix.

Karl



On Tue, Jun 4, 2013 at 1:02 PM, Karl Wright <daddywri@gmail.com> wrote:

> I'm pretty sure this is related to changes that were made for
> CONNECTORS-693.  If I can't get any further shortly, I will disable those
> changes until I can figure out what is wrong.
>
> Karl
>
>
>
> On Tue, Jun 4, 2013 at 1:00 PM, Stephane Gamard <stephane@gamard.net> wrote:
>
>> Hi Karl,
>>
>> I've looked into the simpleHistory, and unfortunately the message is the
>> same as in the log:
>>
>>  06-04-2013 18:55:12.876 fetch http://wiki.apache.org/solr/
>> -1042 765 Interrupted: IO exception reading response stream:
>> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
>> returned value out of range -1..255: -117
>>
>>
>> On Tue, Jun 4, 2013 at 6:54 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Hi Stephane,
>>>
>>> I'll look into the problem, but it would be great if you could have a
>>> look at the Simple History and tell me if you see a stack trace there.
>>> I've not seen this issue before and having a line number would be really
>>> helpful.
>>>
>>> Karl
>>>
>>>
>>> On Tue, Jun 4, 2013 at 12:45 PM, Stephane Gamard <stephane@gamard.net> wrote:
>>>
>>>> Hi All,
>>>>
>>>>
>>>> Just checked out the trunk to test CONNECTORS-700 and glad it works
>>>> (thanks a whole bunch!). Just wondering about a new bug I have. The
>>>> previously running web crawler is now broken. I've dropped it and created a
>>>> new one and I have the following error:
>>>>
>>>>
>>>>  WARN 2013-06-04 18:40:08,393 (Worker thread '1') - Pre-ingest service
>>>> interruption reported for job 1370363902673 connection
>>>> 'default-web-repository': IO exception reading response stream:
>>>> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
>>>> returned value out of range -1..255: -117
>>>>
>>>>
>>>> Then the job stays at that status:
>>>>  **Restart**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>   **Restart minimal**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>   **Pause**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>   **Abort**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>   wiki-documentation  Running  Tue Jun 04 18:40:04 CEST 2013  1 11
>>>>
>>>>
>>>> Any idea about why?
>>>>
>>>> Attached is the full log
>>>>
>>>> _Stephane
>>>>
>>>
>>>
>>
>
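
The error quoted above says that read() returned -117, outside the -1..255 range that java.io.InputStream.read() is documented to return. One common way to get such a value is to return a raw signed byte from read() without masking it. Whether that is what actually happened in ThrottledFetcher is not confirmed anywhere in this thread. The sketch below only illustrates the general mechanism; the class names (ReadRangeDemo, SignedByteStream, MaskedByteStream) are hypothetical and are not part of ManifoldCF.

```java
import java.io.InputStream;

public class ReadRangeDemo {

    /** Hypothetical stream that returns the raw signed byte (incorrect). */
    static class SignedByteStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        SignedByteStream(byte[] data) { this.data = data; }

        @Override
        public int read() {
            if (pos >= data.length) return -1;   // end of stream
            return data[pos++];                  // sign-extends: 0x8B becomes -117
        }
    }

    /** Hypothetical stream that masks the byte into 0..255 (correct). */
    static class MaskedByteStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        MaskedByteStream(byte[] data) { this.data = data; }

        @Override
        public int read() {
            if (pos >= data.length) return -1;   // end of stream
            return data[pos++] & 0xff;           // 0x8B becomes 139
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0x8B };           // the byte whose signed value is -117
        System.out.println(new SignedByteStream(data).read());  // prints -117
        System.out.println(new MaskedByteStream(data).read());  // prints 139
    }
}
```

A consumer that validates the result of read(), as the exception message suggests one did here, would reject the first stream and accept the second.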
