hc-dev mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject URI impl in HttpClient 4.0 was Re: [jira] Commented: (HTTPCLIENT-587) derelativizing of relative URIs with a scheme is incorrect
Date Sat, 17 Jun 2006 17:43:10 GMT
On Sat, 2006-06-17 at 19:21 +0200, Roland Weber wrote:
> Hi all,
> If other people feel like Gordon, we should consider spinning off the URI class
> into a separate jakarta-commons subproject that exclusively deals with URIs.
> Or into a "Jakarta Network Component", depending on how things discussed on
> the Jakarta general mailing list evolve.

This has been tried before. The original author of the URI code attempted
to have it spun off into a project of its own (commons-uri) shortly before
leaving the HttpClient project for good [1]. Somehow the idea failed to
generate much support.

If people are interested in keeping the URI code evolving and part of
HttpClient 4.0, new contributors must come forward and help us rewrite
and later maintain the code.

> cheers,
>   Roland
> Gordon Mohr (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/HTTPCLIENT-587?page=comments#action_12416592 ]
> > 
> > Gordon Mohr commented on HTTPCLIENT-587:
> > ----------------------------------------
> > 
> > 
> >>What's wrong with the JDK URI class?
> > 
> > 
> > (a) It still has bugs where it fails to implement the spec as well as
> > httpclient.URI. One recent example, still a problem in current JDK 1.6 betas:
> > 
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4708535
> > 
> > java.net.URI base = new java.net.URI("http://www.example.com/some/page");
> > java.net.URI rel = new java.net.URI("");
> > java.net.URI derel = base.resolve(rel);
> > derel.toString();
> > (java.lang.String) http://www.example.com/some/   // INCORRECT
> > 
> > org.apache.commons.httpclient.URI base = new org.apache.commons.httpclient.URI("http://www.example.com/some/page");
> > org.apache.commons.httpclient.URI rel = new org.apache.commons.httpclient.URI("");
> > org.apache.commons.httpclient.URI derel = new org.apache.commons.httpclient.URI(base,rel);
> > derel.toString();
> > (java.lang.String) http://www.example.com/some/page  // CORRECT
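The empty-reference case above can be worked around without replacing the URI class. The following is a minimal sketch, assuming the RFC 3986 rule that an empty reference resolves to the base URI minus its fragment. The class and method names (EmptyReferenceResolver, resolve, stripFragment) are hypothetical and belong to neither the JDK nor HttpClient.

```java
import java.net.URI;

// Hypothetical helper: guards java.net.URI.resolve against the
// empty-reference case discussed in the thread.
final class EmptyReferenceResolver {
    private EmptyReferenceResolver() {}

    // Resolves a reference against a base URI. An empty reference is
    // treated as a same-document reference: the result is the base URI
    // without its fragment. Everything else is delegated to java.net.URI.
    static URI resolve(URI base, String reference) {
        if (reference.isEmpty()) {
            return stripFragment(base);
        }
        return base.resolve(reference);
    }

    // Returns the URI with any "#fragment" suffix removed.
    static URI stripFragment(URI uri) {
        String text = uri.toString();
        int hash = text.indexOf('#');
        return hash < 0 ? uri : URI.create(text.substring(0, hash));
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.example.com/some/page");
        // Prints whatever the running JDK does with an empty reference.
        System.out.println("java.net.URI:   " + base.resolve(""));
        // Prints the base URI unchanged (it has no fragment to strip).
        System.out.println("with the guard: " + resolve(base, ""));
    }
}
```

This only covers the one case from the bug report; any other resolution differences between java.net.URI and httpclient.URI would need their own handling.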
> > 
> > (b) java.net.URI and its maintainers reject the idea that there should be
> > any facility in the URI class for tolerating the same sorts of formal spec
> > deviations often seen in real URIs and domain names.
> > 
> > As one example, domain names with '_' are technically illegal but have
> > often been tolerated by DNS-related software and we have run across
> > functioning websites at subdomains with '_' in their name. Browsers browse
> > these sites fine, so we want to be able to crawl them. java.net.URI can't
> > help us.
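For readers who are stuck with java.net.URI, here is a rough sketch of one way to tolerate such hosts. It assumes the JDK accepts the URI but may return null from getHost() when the host is not a strictly valid hostname, in which case the raw authority is still available. The class LenientHost and its method hostOf are hypothetical names, not part of any library.

```java
import java.net.URI;

// Hypothetical helper: recovers a host name even when java.net.URI
// declines to report one (for example, a host containing '_').
final class LenientHost {
    private LenientHost() {}

    // Returns the host of the URI, falling back to a manual split of the
    // raw authority when getHost() yields null. Returns null if the URI
    // has no authority at all (for example, a mailto: URI).
    static String hostOf(URI uri) {
        String host = uri.getHost();
        if (host != null) {
            return host;
        }
        String authority = uri.getRawAuthority();
        if (authority == null) {
            return null;
        }
        // Drop any "userinfo@" prefix.
        int at = authority.lastIndexOf('@');
        String hostPort = at >= 0 ? authority.substring(at + 1) : authority;
        // Drop a trailing ":port" if what follows the colon is numeric.
        int colon = hostPort.lastIndexOf(':');
        if (colon >= 0 && hostPort.substring(colon + 1).matches("\\d*")) {
            hostPort = hostPort.substring(0, colon);
        }
        return hostPort;
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://foo_bar.example.com:8080/index.html");
        System.out.println("getHost(): " + uri.getHost());
        System.out.println("hostOf():  " + hostOf(uri));
    }
}
```

This is a workaround bolted on from outside, which is the limitation the message describes: the tolerant behaviour cannot be added to java.net.URI itself.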
> > 
> > Now of course, it's legitimate and useful to provide a class which
> > rigorously implements all written standards. Not everyone wants a class
> > that also tolerates de facto practices. But that leads us to the ultimate
> > problem with java.net.URI:
> > 
> > (c) java.net.URI licensing and language declarations make it resistant to
> > reuse and adaptation to other legitimate uses
> > 
> > It's not open source and major portions of its implementation are
> > 'private' or 'final'. So it's impossible to reuse 99% of it (such as its
> > various RFC syntax character-class definitions, fields, and working parsing
> > code) while also either patching the bugs like in (a) above or overriding
> > the strictness which makes it unsuitable for some purposes like in (b) above.
> > 
> > In comparison, the org.apache.commons.httpclient.URI class is friendly to
> > subclassing (which we've used to work around bugs and change the behavior
> > to better fit our problem domain) and if that didn't work with respect to a
> > bug, we'd at least have the option of patching it ourselves and
> > redistributing the fix.
> > 
> > So our project would very much miss the pretty-good (and at least
> > serviceable when broken) httpclient.URI class if it were dropped in favor
> > of the JDK java.net.URI class.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
