river-dev mailing list archives

From Gregg Wonderly <gr...@wonderly.org>
Subject Re: Moving River into the Semantic Web with Codebase Services & Bytecode Analysis services.
Date Tue, 08 Sep 2009 16:26:18 GMT
Peter, I want to write up some questions and thoughts about this post, but I 
can't do that right now; hopefully I can in a day or so.

Gregg Wonderly

Peter Firmstone wrote:
> I've had some more thoughts on Codebase services after spending time 
> researching & reflecting.
> 
> Issues I'd like to see addressed or simplified using Codebase services:
> 
>    * Codebase loss
>    * Codebase replication
>    * Codebase upgrades
>    * Codebase configuration
>    * Codebase surrogates, for objects originating from periodically
>      disconnected clients (they also require Refreshable References and
>      Xuids)
>    * Bytecode Dependency Analysis & API signature identification, for
>      Package & Class Binary Compatibility & ClassLoader Isolation
>    * Bytecode Static Security Analysis, repackaging & code signing. 
> 
> On the last issue I've had some thoughts about Code bases being able to 
> act as a trust mediator to receive, analyse, repackage, sign and forward 
> bytecode on behalf of clients.  The last two items above fit into the 
> category of Bytecode Analysis service responsibilities for codebases.  
> Prior to loading class files, a client can have a trust relationship 
> with one or more preferred codebase providers.  A code base provider 
> also provides bytecode static analysis services for security and binary 
> compatibility purposes.
> I got thinking about this solution after reading about the service proxy 
> circular code verification issues for disconnected clients that Project 
> Neuromancer exposed.  In short: a surrogate security verifier as well as 
> a codebase surrogate.
> 
> All this would be implemented with minimal changes to services and 
> clients configurations and no change to third party library code, unlike 
> my evolving objects framework proposals.
> 
> After receiving a tip-off from Michael Warres, Tim Blackman was gracious 
> enough to share what he learned from his research on class loader trees.  
> Tim built a prototype system using message digests and was considering 
> implementing textual Class API signatures for identifying compatibility 
> between different builds of a class's bytecode.  He considered textual 
> API signatures when he found that independent vendors' compiler 
> optimisations produced different bytecode, and hence different SHA-1 
> digests, for classes with identical and compatible APIs.  I thought 
> about this further and realised that Binary Compatibility for class 
> files and package changes is far more flexible than source code 
> compatibility.  While Tim concentrated on API compatibility to ensure 
> that objects that should be shared could be, he found that groups of 
> class files, based on dependency analysis (this is where the replacement 
> ClassDep code came from), required their own ClassLoaders; hence a 
> significant number of class loader instances is required for maximum 
> compatibility (without going into more detail).
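
A minimal sketch of the textual API signature idea described above. For brevity it assumes reflection on classes that are already loaded, rather than ASM-based analysis of untrusted bytecode, so it only illustrates the principle. The names ApiSignature, textualSignature, sha1Hex and the toy classes V1, V2 and V3 are hypothetical and are not taken from Tim's prototype. A real signature would also need to cover modifiers, supertypes and declared exceptions.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.TreeSet;

public class ApiSignature {

    // Toy classes (hypothetical): V1 and V2 differ only in method bodies and
    // private members, so their bytecode differs; V3 adds a public method.
    public static class V1 {
        public int size() { return 0; }
    }
    public static class V2 {
        public int size() { int n = 1; return n - 1; }
        private void helper() { }
    }
    public static class V3 {
        public int size() { return 0; }
        public void clear() { }
    }

    // Only public and protected members are visible to other packages.
    static boolean isApi(int mods) {
        return Modifier.isPublic(mods) || Modifier.isProtected(mods);
    }

    // Sorted, newline-separated text of the public/protected members.
    // The class name is deliberately left out so versions can be compared.
    public static String textualSignature(Class<?> c) {
        TreeSet<String> lines = new TreeSet<>();
        for (Field f : c.getDeclaredFields()) {
            if (isApi(f.getModifiers()) && !f.isSynthetic()) {
                lines.add("field " + f.getType().getName() + " " + f.getName());
            }
        }
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (isApi(k.getModifiers())) {
                lines.add("ctor " + Arrays.toString(k.getParameterTypes()));
            }
        }
        for (Method m : c.getDeclaredMethods()) {
            if (isApi(m.getModifiers()) && !m.isSynthetic()) {
                lines.add("method " + m.getReturnType().getName() + " "
                        + m.getName() + " " + Arrays.toString(m.getParameterTypes()));
            }
        }
        return String.join("\n", lines);
    }

    // SHA-1 digest of the signature text, as lowercase hex.
    public static String sha1Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        String d1 = sha1Hex(textualSignature(V1.class));
        String d2 = sha1Hex(textualSignature(V2.class));
        String d3 = sha1Hex(textualSignature(V3.class));
        System.out.println(d1.equals(d2)); // true: same API, different bodies
        System.out.println(d1.equals(d3)); // false: V3 adds a public method
    }
}
```

Hashing the sorted member text instead of the raw class file is what makes the digest insensitive to compiler differences: only a change to the public or protected surface changes the result.
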
> 
> In essence, the solution I'm striving for, is to solve the problem in a 
> distributed world that OSGi solves in the JVM; segregation and isolation 
> of incompatibility while allowing compatible implementations to 
> cooperate.  However I want an implementation without commitment to any 
> particular container or module technology, so as not to force container 
> implementation choices on projects that already have their specific 
> container implementations.
> 
> Rather than reinventing another container technology, all jar files a 
> service's client requires could be uploaded to codebase services just 
> prior to service registration.  The codebase service could analyse, 
> repackage and sign the jar files into compatible bundles, dynamic 
> containers if you wish, one for each ClassLoader, where each class 
> loader represents a Package API group signature.
> 
> Using the uploaded jar files, the codebase services could generate and 
> propagate analysis reports amongst themselves in a p2p fashion, such 
> that between them, they could determine the latest binary compatible 
> version of a package, such that the latest compatible version would 
> always be preferred.  Once the latest version is identified, a codebase 
> service can verify it with its own analysis, in order to confirm and 
> report malicious or malfunctioning codebase servers.  Newer versions of 
> a Package, found to have broken Binary Backward compatibility, would be 
> kept in a separate ClassLoader as determined by their API signature, 
> thus incompatibility is isolated.  There may be subgroups within a 
> package, that could also be shared between incompatible package versions 
> to provide improved class file and object sharing.
> 
> Hence a client receiving bytecode could choose to channel it through 
> one or more codebase servers that it has trust relationships with.  As a 
> bytecode trust surrogate, the preferred codebase server could retrieve 
> required bytecode that it doesn't already possess via lookup services of 
> other codebase service locations.  The bytecode recipient would retrieve 
> analysis information detailing bytecode implementation security concerns 
> prior to loading any bytecode.  The codebase server would not execute 
> any untrusted bytecode itself; it would only perform analysis using the 
> ASM library.  The aim would be for a codebase server to be as secure as 
> possible, so that it can be considered trustworthy and as impervious 
> to attack as possible (existing denial of service attack strategies 
> require consideration).  One could even test codebases by uploading 
> deliberately malicious code and checking the resulting analysis 
> reports, or by occasionally confirming the analysis reports with other 
> codebases or with a local codebase analysis process.  Separation of 
> concerns.
> 
> Codebase Services would only be required to maintain a copy of the 
> evolution bloodline for the latest binary backward compatible package.  
> A package fork or breaking of backward compatibility would mean storing 
> a copy of both of the latest divergent compatibility signatures, again 
> some unchanged class subgroups may be shared between them.  Java 
> Bytecode versions (compiler specific) would also dictate which package 
> version could be used safely in local JVMs.
> 
> Clients of services will have to accept a certain amount of downtime: 
> once a particular instance of a package's classes is loaded into a 
> classloader, no other compatible implementation of that package can be 
> loaded.  This is only a problem for long-lived service client 
> processes.  Object state will need to be persisted while the JVM 
> restarts and reloads new bytecode (Serializable is also part of class 
> API).  This is due to the inability of an existing ClassLoader to reload 
> classes (java debug excluded).  Backward Binary compatibility doesn't 
> necessarily imply forward compatibility: classes and interfaces can add 
> methods without breaking compatibility with pre-existing binaries, 
> access can become less restrictive, and abstract methods can become 
> non-abstract.  Even though some of these changes break source code 
> compatibility, old clients aren't aware of the new methods and don't 
> execute them.  For specifics see Chapter 13, Binary Compatibility, of the 
> Java Language Specification, 3rd Edition; this is what I plan to base 
> the compatibility analysis upon.
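
The asymmetry between backward and forward compatibility can be sketched as a set comparison. This is a deliberately crude approximation and not the JLS Chapter 13 rules: it assumes an API is just a set of member signature strings, and treats a newer version as backward compatible when it still offers every member the older one did. The names BinaryCompat, isBackwardCompatible and api are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BinaryCompat {

    // Assumption: an API is modelled as a set of member signature strings.
    // A newer version is treated as backward compatible if it still offers
    // every member that the older version offered (additions are allowed).
    public static boolean isBackwardCompatible(Set<String> oldApi, Set<String> newApi) {
        return newApi.containsAll(oldApi);
    }

    // Convenience helper for building an API set.
    public static Set<String> api(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    public static void main(String[] args) {
        Set<String> v1 = api("int size()");
        Set<String> v2 = api("int size()", "void clear()");

        // Clients built against v1 can run against v2.
        System.out.println(isBackwardCompatible(v1, v2)); // true
        // Clients built against v2 may call clear(), which v1 lacks.
        System.out.println(isBackwardCompatible(v2, v1)); // false
    }
}
```

A real analysis would also have to handle changes to modifiers, supertypes, field types and serialized form, which a plain set comparison does not capture.
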
> 
> It would also be possible for services to utilise codebase servers in 
> their classpath.
> 
> These issues I propose tackling are not simple obstacles, nor will they 
> be easy to implement; some may even be intractable, but what the 
> hell, who's with me?  That's why we got into this in the first place, 
> isn't it?  The challenge!  Project Neuromancer highlighted areas for 
> improvement; if we address some of these, I believe that River can 
> become the much vaunted and dreamt of semantic web.
> 
> I want problems identified so solutions can be devised; let's see 
> objections & supporting logic or better ideas.
> 
> Cheers,
> 
> Peter.