hc-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Oleg Kalnichevski (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HTTPCLIENT-679) URI Absolutization does not follow browser behavior
Date Sun, 05 Aug 2007 12:20:52 GMT

     [ https://issues.apache.org/jira/browse/HTTPCLIENT-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Oleg Kalnichevski resolved HTTPCLIENT-679.
------------------------------------------

    Resolution: Won't Fix

Jeff, Gordon,

URI class in HttpClient 3.x is a complete mess none of the existing committers would touch
even with a barge pole. This class has been replaced with the standard java.net.URI class
in HttpClient 4.0. If you are prepared to contribute a fix for the problem I'll happily review
it and check it to the repository, but I seriously doubt any of us would be willing to invest
any time into fixing old URI code in HttpClient 3.x. 

Oleg

> URI Absolutization does not follow browser behavior
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-679
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-679
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>         Environment: HttpClient 3.1 RC1, 
> JDK 1.6.0
> Ubuntu 7.04
>            Reporter: Jeff Dalton
>
> This was encountered using Heritrix to crawl a prominent website.
> The URI resulting from the HttpClient URI constructor (base, relative) does not follow
browser behavior:
> URI newUrl = new URI(new URI("http://www.theirwebsite.com/browse/results?type=browse&att=1"),
"?sort=0&offset=11&pageSize=10")
> Results in newUrl:
> http://www.theirwebsite.com/browse/?sort=0&offset=11&pageSize=10
> The desired behavior based on Firefox and IE should be:
> http://www.theirwebsite.com/browse/results?sort=0&offset=11&pageSize=10
> These browsers treat the question mark similar to a directory separator and do not require
a file to be specified before the query.
> HttpClient's current behavior does not correspond to current browser behavior and leads
to an inability to crawl certain websites if HttpClient's URI class is used.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org


Mime
View raw message