manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1154) Web connector: Better diagnostic output needed in the case of undifferentiated IO exceptions
Date Fri, 30 Jan 2015 01:19:34 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298023#comment-14298023 ]

Karl Wright commented on CONNECTORS-1154:
-----------------------------------------

Here's the full exception:
{code}
INFO 2015-01-29 16:01:24,973 (Worker thread '20') - WEB: FETCH URL|http://127.0.0.1/kpf/internal/contentslist|1422514884965+6|-104|0|org.apache.manifoldcf.core.interfaces.ManifoldCFException|
Interrupted: IO exception reading response stream: null
DEBUG 2015-01-29 16:01:24,973 (Worker thread '20') - WEB: Fetch exception for 'http://127.0.0.1/kpf/internal/contentslist'
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Interrupted: IO exception reading response stream: null
        at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledConnection.noteInterrupted(ThrottledFetcher.java:1877)
        at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.getDocumentVersions(WebcrawlerConnector.java:799)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:322)
Caused by: org.apache.manifoldcf.agents.interfaces.ServiceInterruption: IO exception reading response stream: null
        at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledConnection.handleIOException(ThrottledFetcher.java:2044)
        at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledConnection.getResponseBodyStream(ThrottledFetcher.java:1831)
        at org.apache.manifoldcf.crawler.connectors.webcrawler.DataCache.addData(DataCache.java:75)
        at org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.getDocumentVersions(WebcrawlerConnector.java:747)
        ... 1 more
Caused by: java.io.EOFException
        at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:264)
        at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:254)
        at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:163)
        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:78)
        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:90)
        at org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ExecuteMethodThread.run(ThrottledFetcher.java:2428)
 WARN 2015-01-29 16:01:24,974 (Worker thread '20') - Pre-ingest service interruption reported for job 1422428631249 connection 'kpf_all_dev': IO exception reading response stream: null
{code}
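
The trailing "null" in the messages above is consistent with the root cause, java.io.EOFException, having no detail message: getMessage() returns null, and concatenating that into a string yields the text "null". The sketch below reproduces the same root exception by opening a GZIPInputStream over an empty body, then shows one possible way to keep the exception class in the message. The class DiagnosticMessageSketch and its methods describe() and gzipOverEmptyBody() are hypothetical names used for illustration only. They are not ManifoldCF code and are not taken from the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class for illustration; not part of ManifoldCF.
class DiagnosticMessageSketch {

  // Builds a description that always names the exception class, so a
  // null detail message does not collapse to the bare string "null".
  static String describe(Throwable t) {
    String detail = t.getMessage();
    return t.getClass().getName()
        + (detail == null ? " (no detail message)" : ": " + detail);
  }

  // Opens a gzip stream over an empty body and returns the exception
  // raised while reading the gzip header, or null if none was raised.
  static IOException gzipOverEmptyBody() {
    try (GZIPInputStream in =
        new GZIPInputStream(new ByteArrayInputStream(new byte[0]))) {
      return null;
    } catch (IOException e) {
      return e;
    }
  }

  public static void main(String[] args) {
    IOException e = gzipOverEmptyBody();
    if (e == null) {
      System.out.println("No exception was raised");
      return;
    }
    // Message built from getMessage() alone.
    System.out.println("IO exception reading response stream: " + e.getMessage());
    // Message built with the exception class included.
    System.out.println("IO exception reading response stream: " + describe(e));
  }
}
```

With the class name included, an exception that carries no detail message still identifies itself in the log line instead of appearing only as "null".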


> Web connector: Better diagnostic output needed in the case of undifferentiated IO exceptions
> --------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1154
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1154
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 1.8, ManifoldCF 2.0
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.9, ManifoldCF 2.1
>
>         Attachments: CONNECTORS-1154.patch
>
>
> IOExceptions of known kind are caught and individually dealt with.  But undifferentiated IO exceptions don't give you much diagnostic information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
