river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject Re: Moving River into the Semantic Web with Codebase Services & Bytecode Analysis services.
Date Tue, 08 Sep 2009 23:14:02 GMT
Look forward to it mate,

N.B. this line should read:

   * Codebase surrogates, for objects originating from periodically
     disconnected services for clients to obtain their bytecode (they
     also require Refreshable References and



Gregg Wonderly wrote:
> Peter, I want to write up some questions and thoughts about this post, 
> but can't do that right now, hopefully I can in a day or so.
> Gregg Wonderly
> Peter Firmstone wrote:
>> I've had some more thoughts on Codebase services after spending time 
>> researching & reflecting.
>> Issues I'd like to see addressed or simplified using Codebase services:
>>    * Codebase loss
>>    * Codebase replication
>>    * Codebase upgrades
>>    * Codebase configuration
>>    * Codebase surrogates, for objects originating from periodically
>>      disconnected clients (they also require Refreshable References and
>>      Xuid's)
>>    * Bytecode Dependency Analysis & API signature identification, for
>>      Package & Class Binary Compatibility & ClassLoader Isolation
>>    * Bytecode Static Security Analysis, repackaging & code signing.
>> On the last issue I've had some thoughts about Code bases being able 
>> to act as a trust mediator to receive, analyse, repackage, sign and 
>> forward bytecode on behalf of clients.  The last two items above fit 
>> into the category of Bytecode Analysis service responsibilities for 
>> codebases.  Prior to loading class files, a client can have a trust 
>> relationship with one or more preferred codebase providers.  A code 
>> base provider also provides bytecode static analysis services for 
>> security and binary compatibility purposes.
>> I got thinking about this solution after reading about the service 
>> proxy circular code verification issues for disconnected clients 
>> that project Neuromancer exposed.  The codebase service would act as 
>> a surrogate security verifier as well as a codebase surrogate.
>> All this would be implemented with minimal changes to services and 
>> clients configurations and no change to third party library code, 
>> unlike my evolving objects framework proposals.
>> After receiving a tip off from Michael Warres, Tim Blackman was 
>> gracious enough to share learnings from his research on class loader 
>> trees.  Tim built a prototype system using message digests and was 
>> considering implementing textual Class API signatures for identifying 
>> compatibility between different class bytecodes.  Tim considered 
>> textual API signatures after he found that independent vendor compiler 
>> optimisations produced different bytecode, hence different SHA-1 
>> digests, even though the classes had identical and compatible APIs.  I 
>> thought about this further and realised that Binary Compatibility for 
>> class files and package change is far more flexible than source code 
>> compatibility.  While Tim concentrated on API compatibility, for 
>> ensuring that objects which should be shared could be, he found that 
>> groups of class files, based on dependency analysis (this is where 
>> the replacement ClassDep code came from), required their own 
>> ClassLoaders; hence a significant number of class loader 
>> instances are required for maximum compatibility (without going into 
>> more detail).
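The textual API signature idea above could be sketched roughly as follows. This is a minimal, reflection-based illustration, not Tim's actual prototype: it digests the sorted printable declarations of a class's public and protected methods, so two binaries compiled by different vendors would hash identically so long as their visible API matched.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Sketch of a textual API signature: digest the sorted textual
 * declarations of a class's public and protected methods, so the
 * result depends only on the visible API, not on how the method
 * bodies happened to be compiled.
 */
public class ApiSignature {
    public static String digest(Class<?> c) throws Exception {
        List<String> lines = new ArrayList<>();
        for (Method m : c.getDeclaredMethods()) {
            int mod = m.getModifiers();
            if (Modifier.isPublic(mod) || Modifier.isProtected(mod)) {
                lines.add(m.toGenericString()); // stable textual form
            }
        }
        Collections.sort(lines); // declaration order must not matter
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (String line : lines) {
            md.update(line.getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two loads of the same API always yield the same signature.
        System.out.println(digest(Comparable.class));
    }
}
```

A real implementation would also need to cover fields, constructors, superclass and interface declarations, and Serializable-related members, since all of those are part of the binary API.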
>> In essence, the solution I'm striving for, is to solve the problem in 
>> a distributed world that OSGi solves in the JVM; segregation and 
>> isolation of incompatibility while allowing compatible 
>> implementations to cooperate.  However I want an implementation 
>> without commitment to any particular container or module technology, 
>> so as not to force container implementation choices on projects that 
>> already have their specific container implementations.
>> Rather than reinventing another container technology,  all jar files 
>> a service's client requires, could be uploaded to codebase services, 
>> just prior to service registration.  The codebase service could 
>> analyse, repackage and sign the jar files into compatible bundles, 
>> dynamic containers if you wish, one for each ClassLoader, where each 
>> class loader represents a Package API group signature.
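The "one ClassLoader per Package API group signature" arrangement might look something like this minimal sketch, assuming the codebase service has already repackaged the jars into bundles (the class and method names here are hypothetical):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch: each package API group signature gets its own ClassLoader,
 * so incompatible versions are isolated, while requests carrying an
 * identical signature share one loader (and therefore one set of
 * Class objects).
 */
public class BundleLoaders {
    // key: API group signature, value: loader for that bundle
    private final Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();

    public ClassLoader loaderFor(String apiSignature, URL bundleJar,
                                 ClassLoader parent) {
        // Reuse the loader for a known signature; create one otherwise.
        return loaders.computeIfAbsent(apiSignature,
            sig -> new URLClassLoader(new URL[] { bundleJar }, parent));
    }

    public static void main(String[] args) throws Exception {
        BundleLoaders bl = new BundleLoaders();
        URL jar = new java.io.File("bundle-a.jar").toURI().toURL();
        ClassLoader parent = ClassLoader.getSystemClassLoader();
        // Same signature: same loader. Different signature: isolated loader.
        System.out.println(bl.loaderFor("sig-1", jar, parent)
                        == bl.loaderFor("sig-1", jar, parent));
    }
}
```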
>> Using the uploaded jar files, the codebase services could generate 
>> and propagate analysis reports amongst themselves in a p2p fashion, 
>> such that between them they could determine the latest binary 
>> compatible version of a package, and that version would always be 
>> preferred.  Once the latest version is identified, a codebase 
>> service can verify it with its own analysis, in order to confirm and 
>> report malicious or malfunctioning codebase servers.  Newer versions 
>> of a package found to have broken binary backward compatibility 
>> would be kept in a separate ClassLoader, as determined by their API 
>> signature; thus incompatibility is isolated.  There may be subgroups 
>> within a package that could also be shared between incompatible 
>> package versions, to provide improved class file and object sharing.
>> Hence a client receiving bytecode could choose to channel it through 
>> one or more codebase servers that it has trust relationships with: a 
>> bytecode trust surrogate.  The preferred codebase server could 
>> retrieve required bytecode that it doesn't already possess via lookup 
>> services of other codebase service locations.  The bytecode recipient 
>> would retrieve analysis information detailing bytecode implementation 
>> security concerns prior to loading any bytecode.  The codebase server 
>> would not execute any untrusted bytecode itself, only perform 
>> analysis using the ASM library; the aim would be that a codebase 
>> server is as secure as possible, such that it can be considered 
>> trustworthy and as impervious to attack as possible (existing denial 
>> of service attack strategies require consideration).  One could even 
>> perform tests on codebases, by uploading deliberately malicious code 
>> and checking the resulting analysis reports, or by occasionally 
>> confirming the analysis reports with other codebases or with a local 
>> codebase analysis process.  Separation of concerns.
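To make the "analyse without executing" idea concrete, here is a deliberately naive, JDK-only sketch that scans a class file's raw bytes for constant-pool strings naming sensitive JDK classes. A real analyser would parse the constant pool and instructions properly (e.g. with the ASM library, as suggested above); this substring scan is only illustrative and can produce false positives:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/**
 * Naive static security screen: inspect untrusted bytecode for
 * references to sensitive JDK classes without ever defining (and
 * therefore without ever executing) the class.
 */
public class NaiveScreen {
    private static final String[] FLAGGED = {
        "java/lang/Runtime", "java/lang/ProcessBuilder", "java/io/File"
    };

    public static List<String> scan(byte[] classFile) {
        // ISO-8859-1 maps bytes 1:1 to chars, so substring search is safe.
        String blob = new String(classFile, StandardCharsets.ISO_8859_1);
        List<String> hits = new ArrayList<>();
        for (String name : FLAGGED) {
            if (blob.contains(name)) hits.add(name);
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Scan this very class file as a stand-in for uploaded bytecode;
        // its own FLAGGED string literals trip the scan.
        try (InputStream in =
                NaiveScreen.class.getResourceAsStream("NaiveScreen.class")) {
            System.out.println(scan(in.readAllBytes()));
        }
    }
}
```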
>> Codebase Services would only be required to maintain a copy of the 
>> evolution bloodline for the latest binary backward compatible 
>> package.  A package fork, or a break in backward compatibility, would 
>> mean storing a copy of both of the latest divergent compatibility 
>> signatures; again, some unchanged class subgroups may be shared 
>> between them.  Java bytecode versions (compiler specific) would also 
>> dictate which package version could be used safely in local JVMs.
>> Clients of services will have to accept a certain amount of downtime: 
>> once a particular instance of a package's classes is loaded into a 
>> classloader, no other compatible implementation of that package can 
>> be loaded; this is only a problem for long lived service client 
>> processes.  Object state will need to be persisted while the JVM 
>> restarts and reloads new bytecode (Serializable is also part of 
>> class API).  This is due to the inability of an existing ClassLoader 
>> to reload classes (java debug excluded).  Backward binary 
>> compatibility doesn't necessarily imply forward compatibility: 
>> classes and interfaces can add methods without breaking compatibility 
>> with pre existing binaries, member visibility can be widened, and 
>> abstract methods can become non abstract.  Even though some of these 
>> changes break source code compatibility, old clients aren't aware of 
>> the new methods and don't execute them.  For specifics see Chapter 
>> 13, Binary Compatibility, of the Java Language Specification, 3rd 
>> Edition; this is what I plan to base the compatibility analysis upon.
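A tiny illustration of the JLS Chapter 13 point (the class names are made up for the example): adding a method is a binary compatible change, because a client compiled against the old version never refers to the new method, so its call sites still resolve unchanged.

```java
// Version 1, which OldClient was originally compiled against:
//   public class Greeter {
//       public String greet() { return "hello"; }
//   }

// Version 2, dropped in later WITHOUT recompiling OldClient:
class Greeter {
    public String greet() { return "hello"; }
    // New overload: binary compatible; old binaries simply never call it.
    public String greet(String name) { return "hello " + name; }
}

class OldClient {
    // The invokevirtual for greet()Ljava/lang/String; compiled against
    // version 1 resolves identically against version 2.
    static String run() { return new Greeter().greet(); }
}

public class BinaryCompatDemo {
    public static void main(String[] args) {
        System.out.println(OldClient.run()); // prints "hello"
    }
}
```

By contrast, adding an abstract method to `Comparable`-style interfaces breaks source compatibility for implementors, yet old binaries that never invoke the new method keep linking, which is exactly the flexibility the analysis would exploit.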
>> It would also be possible for services to utilise codebase servers in 
>> their classpath.
>> These issues I propose tackling are not simple obstacles, nor will 
>> they be easy to implement; some issues may even be intractable, but 
>> what the hell, who's with me?  That's why we got into this in the 
>> first place isn't it?  The challenge!  Project Neuromancer 
>> highlighted areas for improvement; if we address some of these, I 
>> believe that River can become the much vaunted and dreamt of semantic 
>> web.
>> I want problems identified so solutions can be devised; let's see 
>> objections & supporting logic, or better ideas.
>> Cheers,
>> Peter.
