hc-dev mailing list archives

From "Julien Nioche (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HTTPCLIENT-1643) More tolerant handling of unsupported Content-Coding
Date Thu, 23 Apr 2015 10:59:38 GMT

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508836#comment-14508836 ]

Julien Nioche commented on HTTPCLIENT-1643:

Good idea. Will you set the value of 'strict' to false by default?

BTW I'm still discovering the API and am not 100% familiar with how to do things with it.
I've recently started using it in [https://github.com/DigitalPebble/storm-crawler/pull/122]
to get rid of the cumbersome, bug-prone, low-level HTTP protocol implementation I had borrowed
from Nutch. I will probably contribute it back to Nutch at some point soon.

> More tolerant handling of unsupported Content-Coding
> ----------------------------------------------------
>                 Key: HTTPCLIENT-1643
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1643
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 4.4.1
>            Reporter: Julien Nioche
>            Priority: Minor
>             Fix For: 4.5
> The following URL can be fetched by curl 
> {code}
> curl -I http://meta.ats.hrsmart.com/cgi-bin/a/alljobs.cgi
> HTTP/1.1 200 OK
> Date: Wed, 22 Apr 2015 15:49:52 GMT
> Server: Apache/1.3.33 (Debian GNU/Linux) mod_throttle/3.1.2 PHP/4.3.10-15 mod_ssl/2.8.22
> Content-Type: text/html; charset=iso-8859-1
> Content-Encoding: script 
> {code}
> but not by HttpClient, as the Content-Encoding value returned by the server is invalid.
> This results in an org.apache.http.client.ClientProtocolException being thrown.
> Instead of failing the whole fetch, couldn't you treat an illegal value like this as
> empty or give it a default value?
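
To make the request above concrete, here is a minimal sketch of strict versus tolerant handling of an unknown Content-Coding. It uses only the JDK and is not HttpClient code: the class `ContentCodingSketch`, the method `decode`, and the use of a plain `IOException` in place of `ClientProtocolException` are all assumptions made for illustration. Only the `strict` flag name comes from this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, NOT part of HttpClient: it only illustrates the
// strict vs. tolerant behaviour discussed in this issue.
final class ContentCodingSketch {

    private ContentCodingSketch() {
    }

    /**
     * Wraps the response body according to the Content-Encoding header value.
     *
     * @param body            raw response body
     * @param contentEncoding header value as sent by the server, may be null
     * @param strict          if true, an unknown coding fails the fetch;
     *                        if false, the body is passed through unchanged
     */
    static InputStream decode(InputStream body, String contentEncoding, boolean strict)
            throws IOException {
        if (contentEncoding == null) {
            return body;
        }
        // Header values may carry stray whitespace, e.g. "script " above.
        String coding = contentEncoding.trim().toLowerCase(Locale.ROOT);
        switch (coding) {
            case "":
            case "identity":
                return body;
            case "gzip":
            case "x-gzip":
                return new GZIPInputStream(body);
            case "deflate":
                return new InflaterInputStream(body);
            default:
                if (strict) {
                    // Stand-in for the ClientProtocolException reported here.
                    throw new IOException("Unsupported Content-Coding: " + coding);
                }
                // Tolerant mode: treat the unknown coding as if it were absent.
                return body;
        }
    }
}
```

With this sketch, `decode(body, "script ", false)` hands back the body untouched, whereas the same call with `strict` set to `true` fails the fetch with an exception.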

