hc-dev mailing list archives

From "Ian Blavins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HTTPCORE-195) Make it possible to tolerate truncated chunk streams
Date Sun, 23 Sep 2012 22:37:07 GMT

    [ https://issues.apache.org/jira/browse/HTTPCORE-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461530#comment-13461530 ]

Ian Blavins commented on HTTPCORE-195:


I'm working several layers above the InputStreamReader. The line of code that invokes ChunkedInputStream
(which detects the truncated chunk and throws the exception) is:

        HttpResponse theResponse = httpClient.execute(target, request, context);

When the TruncatedChunkException is thrown, theResponse is left null. Other than sending the
request again (which would produce the same result), there is nothing useful I can do. Obviously
I could change the code to work at a lower level, but that would defeat the attraction of using
HttpComponents in the first place. The only useful option I can see would be an HttpClient
parameter that says to treat a truncated chunk as end of file. Then theResponse in the above
code would be non-null. Whether its content would be valid would depend on whether the chunk
was actually missing data or was complete but smaller than the (incorrect) chunk length
specification (the more likely scenario).
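
To make the "treat a truncated chunk as end of file" idea concrete, here is a minimal sketch in
plain Java. It does not use HttpComponents at all: LenientChunkedReader, Result and readAll are
hypothetical names invented for this example, and nothing here implies that such an HttpClient
parameter exists. The sketch assumes the raw chunked body is available as an InputStream, ignores
trailers, and reports truncation through a flag so the caller can decide how far to trust the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of HttpCore or HttpClient.
public class LenientChunkedReader {

    /** Decoded body plus a flag recording whether the stream ended early. */
    public static final class Result {
        public final byte[] body;
        public final boolean truncated;

        Result(byte[] body, boolean truncated) {
            this.body = body;
            this.truncated = truncated;
        }
    }

    /** Decodes a chunked body; premature end of stream is reported, not thrown. */
    public static Result readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                // Stream ended before the terminating zero-size chunk.
                return new Result(out.toByteArray(), true);
            }
            // Drop any chunk extension after ';' and parse the hex size.
            int semi = sizeLine.indexOf(';');
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int remaining;
            try {
                remaining = Integer.parseInt(hex, 16);
            } catch (NumberFormatException e) {
                throw new IOException("Bad chunk size line: " + sizeLine, e);
            }
            if (remaining < 0) {
                throw new IOException("Negative chunk size: " + sizeLine);
            }
            if (remaining == 0) {
                // Last chunk; trailers are ignored in this sketch.
                return new Result(out.toByteArray(), false);
            }
            while (remaining > 0) {
                int n = in.read(buf, 0, Math.min(buf.length, remaining));
                if (n < 0) {
                    // Truncated chunk: keep what arrived and stop.
                    return new Result(out.toByteArray(), true);
                }
                out.write(buf, 0, n);
                remaining -= n;
            }
            if (readLine(in) == null) {
                // The CRLF that should follow the chunk data is missing.
                return new Result(out.toByteArray(), true);
            }
        }
    }

    /** Reads one line ending in LF (CR stripped); returns null if the stream ends first. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                int len = sb.length();
                if (len > 0 && sb.charAt(len - 1) == '\r') {
                    sb.setLength(len - 1);
                }
                return sb.toString();
            }
            sb.append((char) b);
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Second chunk declares 10 bytes (hex A) but only 3 arrive.
        byte[] wire = "5\r\nhello\r\nA\r\nwor".getBytes(StandardCharsets.US_ASCII);
        Result r = readAll(new ByteArrayInputStream(wire));
        System.out.println(new String(r.body, StandardCharsets.US_ASCII)
                + " truncated=" + r.truncated);
    }
}
```

Returning a flag rather than silently swallowing the condition keeps the two cases above
distinguishable to the caller: the bytes are handed over either way, but the caller still knows
the chunk framing was broken and can log or discard the content as it sees fit.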

It's not a big issue for me. If a site triggers a TruncatedChunkException I just tell my
customer they can't have it. Since I fixed my connection close issue I haven't seen any
anyway. I see the information about the probable underlying cause of the TruncatedChunkException
in the original problem as the more useful contribution to the issue. That will solve
more problems than being able to handle TruncatedChunkException at the getResponse level.
> Make it possible to tolerate truncated chunk streams
> ----------------------------------------------------
>                 Key: HTTPCORE-195
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-195
>             Project: HttpComponents HttpCore
>          Issue Type: Improvement
>          Components: HttpCore NIO
>    Affects Versions: 4.0
>            Reporter: Patrick Moore
>            Priority: Minor
>             Fix For: 4.1-alpha1
>         Attachments: chunkValidationDecoupling.patch, HTTPCORE-195.patch
> Our server is webcrawling.
> We are frequently encountering this issue. We think this might be related to something
> on the server that we are scanning. But that doesn't matter. We need to handle such cases
> without exceptions. (From my perspective, such things should generate a debug message -- certainly
> not an exception that ends processing and throws away the retrieved content!)
> http://stuftpizza.com/ seems to reliably result in this problem
> Maybe Transfer-Encoding? http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6
> Either way we need to be able to deal with issues on the other servers.
> {{{
> Date	Mon, 20 Apr 2009 03:56:45 GMT
> Server	Apache/2.2.3 (Red Hat)
> Accept-Ranges	bytes
> Connection	close
> Transfer-Encoding	chunked
> Content-Type	text/html
> '''Request Headers'''
> Host	stuftpizza.com
> User-Agent	Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv: Gecko/2009032608
> Accept	text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language	en-us,en;q=0.5
> Accept-Encoding	gzip,deflate
> Accept-Charset	ISO-8859-1,utf-8;q=0.7,*;q=0.7
> Keep-Alive	300
> Connection	keep-alive
> Cookie	
> __utma=47358053.1237981682.1240199754.1240199754.1240199754.1; __utmb=47358053; __utmc=47358053;
> =47358053.1240199754.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)
> Cache-Control	max-age=0
> }}}
> {{{
> 20:51:08,768 INFO  [nioEventListener] Request http://stuftpizza.com/ failed with exception.
> org.apache.http.MalformedChunkCodingException: Truncated chunk
> 	at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:203)
> 	at org.apache.http.nio.util.SimpleInputBuffer.consumeContent(SimpleInputBuffer.java:60)
> 	at org.apache.http.nio.entity.BufferingNHttpEntity.consumeContent(BufferingNHttpEntity.java:72)
> 	at org.apache.http.nio.protocol.AsyncNHttpClientHandler.inputReady(AsyncNHttpClientHandler.java:236)
> 	at org.apache.http.nio.protocol.BufferingHttpClientHandler.inputReady(BufferingHttpClientHandler.java:118)
> 	at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:178)
> 	at org.apache.http.impl.nio.DefaultClientIOEventDispatch.inputReady(DefaultClientIOEventDispatch.java:146)
> 	at com.amplafi.iomanagement.http.UniversalIOEventDispatch.inputReady(UniversalIOEventDispatch.java:133)
> 	at $IOEventDispatch_120c19cd1c7.inputReady($IOEventDispatch_120c19cd1c7.java)
> 	at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:153)
> 	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:314)
> 	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:294)
> 	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:256)
> 	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:96)
> 	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:556)
> 	at java.lang.Thread.run(Thread.java:637)
> }}}
