hc-dev mailing list archives

From "Oleg Kalnichevski (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HTTPCLIENT-679) URI Absolutization does not follow browser behavior
Date Sun, 05 Aug 2007 12:20:52 GMT

     [ https://issues.apache.org/jira/browse/HTTPCLIENT-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleg Kalnichevski resolved HTTPCLIENT-679.

    Resolution: Won't Fix

Jeff, Gordon,

The URI class in HttpClient 3.x is a complete mess that none of the existing committers would
touch even with a barge pole. This class has been replaced with the standard java.net.URI class
in HttpClient 4.0. If you are prepared to contribute a fix for the problem, I'll happily review
it and check it in to the repository, but I seriously doubt any of us would be willing to invest
any time in fixing the old URI code in HttpClient 3.x.


> URI Absolutization does not follow browser behavior
> ---------------------------------------------------
>                 Key: HTTPCLIENT-679
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-679
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>         Environment: HttpClient 3.1 RC1, 
> JDK 1.6.0
> Ubuntu 7.04
>            Reporter: Jeff Dalton
> This was encountered using Heritrix to crawl a prominent website.
> The URI resulting from the HttpClient URI constructor (base, relative) does not follow
> browser behavior:
> URI newUrl = new URI(new URI("http://www.theirwebsite.com/browse/results?type=browse&att=1"),
> Results in newUrl:
> http://www.theirwebsite.com/browse/?sort=0&offset=11&pageSize=10
> The desired behavior based on Firefox and IE should be:
> http://www.theirwebsite.com/browse/results?sort=0&offset=11&pageSize=10
> These browsers treat the question mark similarly to a directory separator and do not require
> a file to be specified before the query.
> HttpClient's current behavior does not correspond to current browser behavior and leads
> to an inability to crawl certain websites if HttpClient's URI class is used.
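
The browser behavior described in the report matches how RFC 3986 (section 5.2.2)
resolves a reference that consists only of a query: the base path is kept and only the
query is replaced. The following is a minimal Java sketch of that single case, not the
HttpClient implementation and not a general resolver. The class QueryOnlyResolve and the
method resolveQueryOnly are hypothetical names, and the relative reference is an
assumption inferred from the result URLs quoted above, since the constructor call in the
report is cut off.

```java
// Hypothetical helper for illustration; not part of HttpClient or the JDK.
final class QueryOnlyResolve {

    // Resolves a query-only reference ("?a=b") against a base URI string,
    // keeping the base path and replacing only the query (RFC 3986, 5.2.2).
    public static String resolveQueryOnly(String base, String reference) {
        if (!reference.startsWith("?")) {
            throw new IllegalArgumentException("expected a query-only reference");
        }
        // The base fragment is never carried over to the resolved URI.
        int hash = base.indexOf('#');
        String withoutFragment = hash >= 0 ? base.substring(0, hash) : base;
        // Drop the base query; scheme, authority, and path stay untouched.
        int question = withoutFragment.indexOf('?');
        String withoutQuery = question >= 0
                ? withoutFragment.substring(0, question)
                : withoutFragment;
        return withoutQuery + reference;
    }

    public static void main(String[] args) {
        String base = "http://www.theirwebsite.com/browse/results?type=browse&att=1";
        // Assumed relative reference, inferred from the URLs in the report.
        String reference = "?sort=0&offset=11&pageSize=10";
        System.out.println(resolveQueryOnly(base, reference));
    }
}
```

With the assumed inputs, the sketch keeps the "results" path segment, which is the
outcome the reporter observed in Firefox and IE.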

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org
