hc-dev mailing list archives

From "Roland Weber (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HTTPCLIENT-679) URI Absolutization does not follow browser behavior
Date Sat, 11 Aug 2007 12:43:43 GMT

     [ https://issues.apache.org/jira/browse/HTTPCLIENT-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roland Weber updated HTTPCLIENT-679:
------------------------------------

    Fix Version/s: 3.1 Final

> URI Absolutization does not follow browser behavior
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-679
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-679
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>         Environment: HttpClient 3.1 RC1, JDK 1.6.0, Ubuntu 7.04
>            Reporter: Jeff Dalton
>             Fix For: 3.1 Final
>
>         Attachments: uri_fix.patch
>
>
> This was encountered using Heritrix to crawl a prominent website.
> The URI produced by the HttpClient URI constructor (base, relative) does not follow browser behavior:
> URI newUrl = new URI(new URI("http://www.theirwebsite.com/browse/results?type=browse&att=1"), "?sort=0&offset=11&pageSize=10");
> Results in newUrl:
> http://www.theirwebsite.com/browse/?sort=0&offset=11&pageSize=10
> The desired behavior, based on Firefox and IE, is:
> http://www.theirwebsite.com/browse/results?sort=0&offset=11&pageSize=10
> These browsers treat the question mark similarly to a directory separator and do not require a file to be specified before the query.
> HttpClient's current behavior does not match current browser behavior, which makes it impossible to crawl certain websites when HttpClient's URI class is used.
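
For reference, RFC 3986 (section 5.2.2) resolves a reference that consists only of a query by keeping the base path and replacing just the query, which matches the browser result quoted above. The following is a minimal sketch of that one rule in plain Java. It does not use HttpClient's URI class and is not the attached uri_fix.patch. The names QueryOnlyReference and resolveQueryOnly are hypothetical, and the sketch assumes an absolute base URI and a reference that begins with '?'.

```java
// Hypothetical helper, for illustration only: handles just the
// query-only case of relative reference resolution.
final class QueryOnlyReference {
    private QueryOnlyReference() {}

    /**
     * Resolves a reference starting with '?' against an absolute base URI
     * string: the base scheme, authority and path are kept, and the base
     * query and fragment are replaced by those of the reference.
     */
    public static String resolveQueryOnly(String base, String reference) {
        if (!reference.startsWith("?")) {
            throw new IllegalArgumentException(
                "not a query-only reference: " + reference);
        }
        // Drop the base fragment, if any.
        int hash = base.indexOf('#');
        String withoutFragment = hash >= 0 ? base.substring(0, hash) : base;
        // Drop the base query, if any, but keep the full path.
        int query = withoutFragment.indexOf('?');
        String withoutQuery =
            query >= 0 ? withoutFragment.substring(0, query) : withoutFragment;
        return withoutQuery + reference;
    }

    public static void main(String[] args) {
        String base =
            "http://www.theirwebsite.com/browse/results?type=browse&att=1";
        String resolved = resolveQueryOnly(base, "?sort=0&offset=11&pageSize=10");
        // Prints the URL the report lists as the desired behavior.
        System.out.println(resolved);
    }
}
```

Run against the base URL from the report, the sketch prints http://www.theirwebsite.com/browse/results?sort=0&offset=11&pageSize=10. A full resolver would also have to cover the other reference forms (absolute, network-path, relative-path), which this sketch deliberately leaves out.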

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



