river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject PreferredClassProvider: URL and URI
Date Mon, 06 Aug 2012 12:38:38 GMT
It turns out the recent failures on Solaris x64 hudson are due to an 
illegal character in the URI host name string:

Testcase: 
testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest):    
Caused an ERROR
Illegal character in hostname at index 13: 
http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13: 
http://hudson_solaris:9081/nonactivatablegroup-dl.jar

The constructor URI(String str) cannot parse the host name, so it treats 
the string only as an authority component, which leads to the errors seen 
in the failing hudson tests.
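
A minimal sketch of that behaviour, using only java.net.URI.  The class and method names (UnderscoreHostDemo, lenientHost, strictError) are made up for illustration and are not part of River; the hyphenated host name is likewise just an example.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UnderscoreHostDemo {

    /** Host as seen by the lenient URI(String) parse, or null if none was parsed. */
    static String lenientHost(String s) {
        // URI.create wraps new URI(String) and rethrows as IllegalArgumentException.
        return URI.create(s).getHost();
    }

    /** Message from strict server-authority parsing, or null if the URI is accepted. */
    static String strictError(String s) {
        try {
            URI.create(s).parseServerAuthority();
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String bad = "http://hudson_solaris:9081/nonactivatablegroup-dl.jar";
        String ok = "http://hudson-solaris:9081/nonactivatablegroup-dl.jar"; // hypothetical rename

        System.out.println("lenient host (underscore): " + lenientHost(bad));
        System.out.println("strict error (underscore): " + strictError(bad));
        System.out.println("lenient host (hyphen):     " + lenientHost(ok));
        System.out.println("strict error (hyphen):     " + strictError(ok));
    }
}
```

With the underscore, the lenient parse keeps the authority but reports no host, and only the strict parse raises the URISyntaxException shown in the test output above.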

So the good news is, it isn't a problem with Solaris x64.

But it does raise some important questions.

But first some background...

The summary of semantic changes:

    * Previous releases of the Jini & River PreferredClassProvider have
      relied upon URL and the calling thread's context ClassLoader
      (called the parent loader) to determine the correct
      PreferredClassLoader.  Basically, URL resolves to an IP address, so
      the ClassLoader was determined by the IP address of the codebase
      and the calling thread's context ClassLoader.
    * We could use URI and the calling thread's context ClassLoader
      instead.  This means that the PreferredClassLoader would be
      determined by the normalised form of the URI and the context
      ClassLoader.
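
To make the second option concrete, here is a rough sketch of a lookup key built from a normalised URI plus the parent loader.  LoaderKeySketch and Key are hypothetical names, a single URI stands in for the codebase, and this is not how PreferredClassProvider is actually written; it only illustrates the idea.

```java
import java.net.URI;

final class LoaderKeySketch {

    /** Hypothetical cache key: normalised codebase URI plus the parent (context) loader. */
    static final class Key {
        private final URI codebase;
        private final ClassLoader parent;

        Key(URI codebase, ClassLoader parent) {
            // normalize() only removes "." and ".." path segments; no DNS lookup happens.
            this.codebase = codebase.normalize();
            this.parent = parent;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // String based URI comparison, identity comparison for the loader.
            return codebase.equals(k.codebase) && parent == k.parent;
        }

        @Override
        public int hashCode() {
            return 31 * codebase.hashCode() + System.identityHashCode(parent);
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        Key a = new Key(URI.create("http://Example.com:9081/a/../x.jar"), parent);
        Key b = new Key(URI.create("http://example.com:9081/x.jar"), parent);
        System.out.println("same key: " + a.equals(b));
    }
}
```

Two keys match when the normalised URIs are equal and the parent loader is the same instance, without either side resolving the host.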

The nitty gritty:

    * Relying on the IP address probably made sense in the 1990s.  Today
      there are issues with virtual hosts and dynamic IP addresses, and
      with maintaining a fixed IP address over time and arranging
      failover codebase replication so that the codebase always appears
      at the same IP address to clients.  You might also imagine that
      if Jini / River hits the internet, NAT and routing would cause
      some big problems.  This doesn't mean we couldn't continue to use
      URL and provide a URL Handler for some new protocol that solves
      these issues.
    * Changing to URI brings some big benefits, but it comes at a
      price.  The benefits are an added layer of indirection and a
      documented standard that can be used to predict ClassLoader
      selection reliably, regardless of protocol, as well as cheaper
      codebase replication, backup, regional redirection and hosting
      of codebases.  (Obviously signing jar files will be an important
      step to prevent unwanted codebase mixing, e.g. a remote codebase
      attack with DNS poisoning.)  Dynamic IP addresses, virtual hosts
      and failover hosting will work too.  What's the price, you may
      ask?  For proper comparison, URIs must have a strictly restricted
      character set and be normalised, e.g. legal but escaped characters
      must be unescaped, the scheme must be in lower case, the host must
      also be in lower case, and the path is case sensitive (file URL
      paths on Windows must be converted to upper case).  Only after
      normalisation is complete can we accurately call hashCode() and
      equals() on URI instances, since this eliminates false negatives.

        * Avoid using the underscore (_) character in machine names.
          Internet standards dictate that domain names conform to the
          host name requirements described in Internet Official Protocol
          Standards RFC 952 and RFC 1123. Domain names must contain only
          letters (upper or lower case) and digits. Domain names can
          also contain dash characters ( - ) as long as the dashes are
          not on the ends of the name. Underscore characters ( _ ) are
          not supported in the host name.

    * The huge benefit is that we can now perform URI string based
      comparison without relying on URL Handlers for identity, so
      future URL Handlers will also have the same expected behaviour and
      must conform to standards.  This will also have huge performance
      benefits: class resolution will no longer block on DNS calls, nor
      will equals and hashCode comparison rely on DNS resolution.
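
The normalisation steps described above could be sketched roughly as follows.  This is only an illustration under some assumptions: UriNormaliseSketch and normalise are hypothetical names, only hierarchical URIs with a server-based authority are handled, the Windows file path upper-casing is left out, and an escaped "/" (%2F) in the path would be decoded too, which a real implementation would need to guard against.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class UriNormaliseSketch {

    /** Normalises scheme, host, dot segments and legal-but-escaped characters. */
    static URI normalise(String s) {
        try {
            // Strict parse (rejects hosts such as hudson_solaris), then drop "." and "..".
            URI u = new URI(s).parseServerAuthority().normalize();
            String scheme = u.getScheme() == null ? null : u.getScheme().toLowerCase(Locale.ROOT);
            String host = u.getHost() == null ? null : u.getHost().toLowerCase(Locale.ROOT);
            // The getters return decoded components; the multi-argument constructor
            // re-quotes only characters that are illegal, so %7E comes back as "~".
            return new URI(scheme, u.getUserInfo(), host, u.getPort(),
                           u.getPath(), u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI raw1 = URI.create("http://example.com/%7Euser/x.jar");
        URI raw2 = URI.create("http://example.com/~user/x.jar");
        // Without normalisation the escaped and unescaped forms compare as different.
        System.out.println("raw equal:        " + raw1.equals(raw2));
        System.out.println("normalised equal: "
                + normalise(raw1.toString()).equals(normalise(raw2.toString())));
        System.out.println(normalise("HTTP://Example.COM:9081/a/./b/../%7Euser/x.jar"));
    }
}
```

The escaped and unescaped forms are the kind of false negative that equals() and hashCode() would otherwise report for what is really the same codebase.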

Currently non-conforming URLs can be loaded, such as 
"http://hudson_solaris:9081/nonactivatablegroup-dl.jar", but these will 
no longer be supportable with URI.
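
If we go with URI, a deployer could check host names up front.  The sketch below applies the letters, digits and inner dashes rule quoted above; HostNameCheck and isConformingHost are hypothetical names, and the 63 and 253 character limits are the usual DNS ones, added here as an assumption.  It is not the exact grammar java.net.URI applies.

```java
import java.util.regex.Pattern;

final class HostNameCheck {

    // One label: letters and digits, with dashes allowed only in the middle.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    /** Returns true if every dot-separated label follows the host name rules. */
    static boolean isConformingHost(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // Limit -1 keeps empty trailing labels, so "host." is rejected in this sketch.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("hudson_solaris: " + isConformingHost("hudson_solaris"));
        System.out.println("hudson-solaris: " + isConformingHost("hudson-solaris"));
    }
}
```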

What do you say?  Do I revert to tried and tested URL (the devil we 
know) or do I report a bug against the use of an underscore in the 
hudson_solaris hostname and continue refining the use of URI?

Regards,

Peter.





