river-commits mailing list archives

From "Peter Firmstone (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (RIVER-316) RFC Library, Application & Class Versioning, Dynamically Mobile Codebases and Classloading enhancements
Date Tue, 28 Jul 2009 21:54:15 GMT

    [ https://issues.apache.org/jira/browse/RIVER-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735046#action_12735046
] 

Peter Firmstone edited comment on RIVER-316 at 7/28/09 2:53 PM:
----------------------------------------------------------------

An Object's Class Type is the fully qualified class name + the originating ClassLoader.

HTTP codebases are part of the problem: the URLClassLoader is fixed in the object's Type
(class identity), which may change over time.  Michael Warres addressed this problem by creating
a dynamic codebase service where the URL was a cryptographic hash checksum of the jar file
(stored data) identity.
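
A minimal sketch of that idea in Java (not Michael's code): the codebase identifier is derived
from the jar's bytes rather than from a host name, so it stays the same wherever the jar is served
from.  The class name CodebaseId, the "codebase:" scheme and the choice of SHA-256 are assumptions
made purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: names a codebase by the digest of its jar bytes.
final class CodebaseId {
    private CodebaseId() {}

    // Returns the lowercase hex digest of data under the given algorithm.
    static String hexDigest(byte[] data, String algorithm) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Digest not available: " + algorithm, e);
        }
    }

    // "codebase:" is an invented scheme, used here only for illustration.
    static String codebaseUri(byte[] jarBytes) {
        return "codebase:sha-256:" + hexDigest(jarBytes, "SHA-256");
    }

    public static void main(String[] args) {
        byte[] pretendJar = "pretend these are jar bytes".getBytes(StandardCharsets.UTF_8);
        System.out.println(codebaseUri(pretendJar));
    }
}
```

Two copies of the same jar yield the same identifier, and any change to the jar yields a different
one, which is what lets the codebase move between hosts without changing class identity.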

Michael gave a presentation on service-based codebases; apparently not much code was required
to implement it.  We cannot directly copy the code (interfaces etc.) from the presentation
due to copyright, although we can produce functionally equivalent code.

http://www.jini.org/files/meetings/eighth/presentations/Warres/Warres.pdf

So the dynamic service based codebase could move around, and be offered redundantly.

In addition, we could update/upgrade/replace the Hierarchical based PreferredClassLoader relationship
with a more flexible DynamicVersioningClassLoader based on ClassWorlds to segregate incompatible
class packages while granting compatible classes the ability to communicate.  There is BSD
licensed code that we could build on:

http://classworlds.codehaus.org/  ClassWorlds has some very elegant simple models (easy to
code & use) that may help us.

From the above website: "The ClassWorlds model does away with the hierarchy normally associated
with ClassLoaders.  Instead, there is a pool of ClassRealms <http://classworlds.codehaus.org/apidocs/com/codehaus/classworlds/ClassRealm.html>
which can import arbitrary packages from other ClassRealms. Effectively, ClassWorlds turns
the old-style hierarchy into a directed graph."

One might give a library its own ClassLoader in each JVM for instance, then we could make
that library available to the applications / services that depended upon it. A later version
of that library would have a separate ClassLoader so that "Jar hell" or "Classpath Hell" (standalone
JVM talk) or its distributed equivalent "Class Type Hell" or "ClassLoader Hell" are avoided
(unfair: ClassLoaders are a blessing in disguise, they add another dimension to namespaces).

The new com.sun.jini.tool.classdepend package has functionality to record class dependency
relationships in an array.  This could be modified to also store each class's unique SHA-1
or MD5 hash checksum, as well as serialVersionUIDs if they implement Serializable or Externalizable;
this dependency array, or a similar HashMap indexed by class name string, could be returned on
request as part of the codebase service.
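
As a rough sketch of what one entry in such a map might hold (ClassRecord, DependencyIndex and
SampleValue are hypothetical names, not part of classdepend; the checksum is assumed to be computed
elsewhere from the class file bytes).  ObjectStreamClass.lookup() is the standard way to read a
class's serialVersionUID and returns null for classes that are not serializable.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical record of what the codebase service might return for one class.
final class ClassRecord {
    final String className;      // fully qualified class name
    final String checksumHex;    // digest of the class file, computed elsewhere
    final Long serialVersionUID; // null when the class is not Serializable

    ClassRecord(String className, String checksumHex, Long serialVersionUID) {
        this.className = className;
        this.checksumHex = checksumHex;
        this.serialVersionUID = serialVersionUID;
    }

    static ClassRecord of(Class<?> cls, String checksumHex) {
        ObjectStreamClass osc = ObjectStreamClass.lookup(cls); // null if not serializable
        Long uid = (osc == null) ? null : Long.valueOf(osc.getSerialVersionUID());
        return new ClassRecord(cls.getName(), checksumHex, uid);
    }
}

// Hypothetical index keyed by class name string, as suggested above.
final class DependencyIndex {
    private final Map<String, ClassRecord> byName = new HashMap<>();

    void add(ClassRecord record) { byName.put(record.className, record); }

    ClassRecord lookup(String className) { return byName.get(className); }

    public static void main(String[] args) {
        DependencyIndex index = new DependencyIndex();
        index.add(ClassRecord.of(SampleValue.class, "placeholder-checksum"));
        ClassRecord r = index.lookup(SampleValue.class.getName());
        System.out.println(r.className + " serialVersionUID=" + r.serialVersionUID);
    }
}

// Example class with an explicit serialVersionUID.
final class SampleValue implements Serializable {
    private static final long serialVersionUID = 42L;
}
```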

I purchased a copy of the following paper online (thanks to Jim Waldo for the tip, no endorsements
implied), then found a freely available copy you can all read.  It's called Modular Software Upgrades
for Distributed Systems by Sameer Ajmani, Barbara Liskov and Liuba Shrira.  It discusses updating
services in distributed systems.  It is a coarse-grained versioning system.

http://www.pmg.csail.mit.edu/~ajmani/papers/ecoop06-upgrades.pdf

So I'm currently trying to get my head around a new ClassLoader framework for classes where
a developer knows which objects need to be preserved over time, while their underlying class
implementations may change.  This would be a fine-grained versioning system, at the Class level.
The new package com.sun.jini.tool.classdepend can return a dependency tree stored in an
array; this could be extended to record each class's unique SHA-1 or MD5 hash checksum, as
well as serialVersionUIDs if they implement Serializable or Externalizable.  If a Class
implements VersionedClass, it already stores the fully qualified class name.  The SHA-1 or
MD5 hash checksum would also form part of the security of downloaded code, as this could be checked
before loading the Class file.
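
The check before loading could look something like the sketch below.  ClassFileVerifier is a
hypothetical name, SHA-1 is used only because it is one of the two digests mentioned above, and
the expected digest is assumed to have come from a trusted source such as the codebase service.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical gate to run on downloaded class file bytes before defining the class.
final class ClassFileVerifier {
    private ClassFileVerifier() {}

    // Throws SecurityException unless the bytes hash to the expected SHA-1 digest.
    static void verify(byte[] classBytes, byte[] expectedSha1) {
        byte[] actual;
        try {
            actual = MessageDigest.getInstance("SHA-1").digest(classBytes);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
        if (!MessageDigest.isEqual(actual, expectedSha1)) {
            throw new SecurityException("Class file digest mismatch; refusing to load");
        }
    }

    // Decodes a hex string (two characters per byte) into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // "abc" stands in for real class file bytes; its SHA-1 digest is well known.
        byte[] bytes = "abc".getBytes(StandardCharsets.UTF_8);
        verify(bytes, fromHex("a9993e364706816aba3e25717850c26c9cd0d89d"));
        System.out.println("digest matches, safe to hand to the ClassLoader");
    }
}
```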

Versioning, Identity and preservation of object state / contents over time is much harder
and is unsolved. Jim Waldo recommends Classes implement interfaces, where interfaces are used
for Type.  This allows objects from different ClassLoaders that implement the same interface
to interact and be interchanged.   For instance if you have a Class called Money.class (version
1) and you reimplement its methods and add additional methods in Money.class(version 2), the
only way the objects can be used in the same array etc is if they share a common Interface
or ancestor class.  If you inherit Money.class (version 1) and call it MoneyTree.class you
can override all replaced methods and add additional methods and it can be used as a money
object, however you can't use the new additional methods when in company with the original
class Money's objects, only those that existed prior.
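
To make the Money example concrete, here is a small sketch.  In the real scenario both versions
would carry the same class name and live in different ClassLoaders; here they are given the
distinct, made-up names MoneyV1 and MoneyV2 so the example compiles as a single file.

```java
import java.util.Arrays;
import java.util.List;

// The shared type: callers depend on this interface, never on a concrete version.
interface Money {
    long cents();
}

// Stands in for "Money.class (version 1)".
final class MoneyV1 implements Money {
    private final long cents;
    MoneyV1(long cents) { this.cents = cents; }
    public long cents() { return cents; }
}

// Stands in for "Money.class (version 2)": reimplemented, with an additional method.
final class MoneyV2 implements Money {
    private final long cents;
    MoneyV2(long cents) { this.cents = cents; }
    public long cents() { return cents; }
    MoneyV2 plus(Money other) { return new MoneyV2(cents + other.cents()); }
}

final class MoneyDemo {
    // Works for any mix of versions because it only uses the common interface.
    static long total(List<? extends Money> items) {
        long sum = 0;
        for (Money m : items) {
            sum += m.cents();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Money> mixed = Arrays.asList(new MoneyV1(150), new MoneyV2(250));
        System.out.println(total(mixed)); // prints 400
    }
}
```

Note that plus() is only reachable through a MoneyV2 reference; code holding a plain Money sees
only the methods that existed in the common interface.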

To work towards solving this problem, I'm thinking about a ClassLoader Versioning Framework
for Objects we want to distribute and preserve over time, preserving their state and contents
while retaining the ability to upgrade their class files or bytecodes and also change their
type, using Interfaces (mixins) to enable interoperability between objects with different
types.  Serialization can be used to upgrade an object's class file bytecodes (marshal, unmarshal
into a replacement ClassLoader), with the required visibility granted by what we can build on using
ClassWorlds.  Any objects linking to the VersionedClass Object instance could be strongly
linked indirectly via a reference object that is updated with the VersionedObject's new hard
reference location; all other objects could be strongly linked to the ReferenceObject and
retrieve a weak link from the ReferenceObject to the VersionedObject.  The ClassLoader Versioning
Framework would keep a weak reference to each ReferenceObject with a Lease.  After the Lease
expires, the ClassLoader would check whether that object still exists (it is weakly referenced and may
have been garbage collected) and, if so, check whether its Class file has been updated via an Update
Service, which returns the hash code for the updated Class file.  Alternatively, this could be
requested via a ClassLoader method each time a weak reference is requested through the ReferenceObject.
The VersioningClassLoader would then, with the hash code, request a codebaseURL from the Codebase
Service and create the new version in another ClassLoader.  The new VersionedObject would be
strongly referenced by the existing ReferenceObject, leaving the old VersionedObject with
no strong reference, to be Garbage Collected.
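
A minimal sketch of the indirection described above, assuming a generic ReferenceObject that holds
the only strong reference and hands out weak ones.  The type parameter, the replace() method and
the use of a ReadWriteLock are assumptions for illustration; leases, the Update Service and the
actual re-marshalling into a new ClassLoader are left out.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: holds the single strong reference to a versioned object.
final class ReferenceObject<T> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private T current; // the only strong reference to the versioned object

    ReferenceObject(T initial) {
        this.current = initial;
    }

    // Callers receive a weak reference, so they never pin an old version in memory.
    WeakReference<T> getWeakRef() {
        lock.readLock().lock();
        try {
            return new WeakReference<>(current);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Called by the framework once the upgraded instance exists in its new ClassLoader.
    void replace(T upgraded) {
        lock.writeLock().lock();
        try {
            current = upgraded; // the old instance now has no strong reference left
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReferenceObject<String> ref = new ReferenceObject<>("version 1");
        System.out.println(ref.getWeakRef().get());
        ref.replace("version 2");
        System.out.println(ref.getWeakRef().get());
    }
}
```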

A Mutable object would require a Transaction Manager.

Identity is more difficult.  For objects whose identity is sufficiently determined by the equals()
and hashCode() methods this should be sufficient.  However objects that require a unique identity
can be broken down again into two types:

1. Immutable objects where Object.equals() doesn't determine object Identity.
2. Mutable objects where Object.equals() doesn't determine object Identity.

I think #1 could be handled by an ObjectUniqueID service that provides three unique numbers
(or two separate services? a TimeService and a RandomNumberGeneratorService): the time in
milliseconds at the time of request and two random numbers, one of which could be retrieved locally.
The object would receive this service at instantiation time if required.  The likelihood that
two random numbers and the time in milliseconds would produce a match for the same fully qualified
class name would be very small indeed.  It might be desirable that all VersionedObjects contain
UniqueID fields, versionObjectCreationTimeID, versionRemoteRandomID and versionLocalRandomID.
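
A sketch of such an identifier, using the three field names above.  The class name UniqueId, the
create() factory and the use of SecureRandom for the local number are assumptions; the remote
random number is simply passed in, standing in for a call to a remote service.

```java
import java.security.SecureRandom;
import java.util.Objects;

// Sketch only: identity built from a timestamp plus two random numbers.
final class UniqueId {
    private static final SecureRandom LOCAL = new SecureRandom();

    final long versionObjectCreationTimeID; // milliseconds at the time of request
    final long versionRemoteRandomID;       // would come from a remote service
    final long versionLocalRandomID;        // generated in the local JVM

    UniqueId(long creationTime, long remoteRandom, long localRandom) {
        this.versionObjectCreationTimeID = creationTime;
        this.versionRemoteRandomID = remoteRandom;
        this.versionLocalRandomID = localRandom;
    }

    // remoteRandom stands in for the number a remote service would supply.
    static UniqueId create(long remoteRandom) {
        return new UniqueId(System.currentTimeMillis(), remoteRandom, LOCAL.nextLong());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UniqueId)) return false;
        UniqueId other = (UniqueId) o;
        return versionObjectCreationTimeID == other.versionObjectCreationTimeID
                && versionRemoteRandomID == other.versionRemoteRandomID
                && versionLocalRandomID == other.versionLocalRandomID;
    }

    @Override
    public int hashCode() {
        return Objects.hash(versionObjectCreationTimeID, versionRemoteRandomID,
                versionLocalRandomID);
    }

    public static void main(String[] args) {
        UniqueId id = UniqueId.create(12345L);
        System.out.println(id.versionObjectCreationTimeID + ":" + id.versionRemoteRandomID
                + ":" + id.versionLocalRandomID);
    }
}
```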

Well #2 would be more complex, this object would need a Transaction Manager, as well as implement
the requirements of #1.

When checking VersionedObject.equals(), the method must check, prior to any other calculations, that
the object is an instance of each interface implemented.

VersionedObjects could be instantiated by:

public interface VersioningClassLoader {

   // Interface methods are implicitly public and abstract, so no modifier or body is needed.
   ReferenceObject instantiate(Builder builderObject, String fullyQualifiedClassName);

}

The builder Object's Class would have to implement the same interfaces as the VersionedClass.

Public constructors are not permitted as these return strong references; to prevent
the compiler from supplying a default one, classes must declare a private constructor only.

Each VersionedClassLoader would be garbage collected after all VersionedObjects it manages
(whose Class Leases have expired) go out of scope (no strong references left).  Objects whose Class
Leases have expired would be migrated to new ClassLoaders (if any classes have been updated
in the package).  Each ClassLoader might have a maximum age determined by the maximum Class
Lease time (configurable, e.g. 42 days max) plus a grace time window during which it can be given
new Class Leases (probably enough time for all VersionedClasses in a dependency tree to be
downloaded, e.g. 10 days).  Once the ClassLoader is eligible for garbage collection, all Class
files loaded by that ClassLoader are also eligible for garbage collection.  VersionedObjects
that haven't been garbage collected at the time of Class Lease expiry would be migrated to
their new ClassLoader as a background process.  Perhaps a private static counter method in
the ReferenceObject might prioritise the loading of objects into the new ClassLoader
by the number of hits a Class might receive.  Other objects that are required but haven't
yet been transferred would jump the queue on a demand basis.

A package upgrade check could be triggered upon receiving a new class file version.  This
may exclude using Leases, instead a weak reference would return null.  Undecided which way
to go yet.

All objects referenced by a VersionedClass Object's instance fields that are declared transient
would be considered supporting objects and would go out of scope once the VersionedClass Object
does.  For example, a new class file upgrade might dictate another version of an external library;
new objects required by the VersionedObject would have to be created during unmarshalling
and the new class files for the library downloaded via the codebase service.  The codebase
service would also advertise the dependency tree it contains, so the VersioningClassLoader
Framework could determine the library's suitability and make it available via a Library ClassLoader
Realm with ClassWorlds if it isn't already available.

All objects upon unmarshalling would be checked to see if their class currently exists in
a ClassLoader and if the Object already resides in memory, based on Lease validity, identity
or equality; if so, the in-memory object would be used.

A Versioned Object would only be permitted to keep references to objects that meet the following
criteria:

  1.  Non versioned and Non Serializable objects that are declared transient and can be reconstructed
after unmarshalling.
  2.  JVM native implementations that implement Serializable or Externalizable.
  4.  Package Private Versioned Objects, A MobileObjectDelegate, or other VersionedObjects.


Once the basic VersionedClass, MobileObjectDelegate and DynamicVersionedClassLoader interfaces
and implementations are agreed upon, a number of useful utility VersionedClass implementations
could be created, for developers to utilise, preferably Interfaces and their implementations
for immutable objects, such as Currency, Quantity, Unit, Money, etc.

What I'm talking about is conceptual and experimental; I'm hoping others will be able to provide
some thoughts / input, assist and see if we can't produce something useful along with the
changes you've made and lessons learned.  Or alternatively tell me I'm totally nuts ;)

Please raise any issues you can see so they can be addressed, the earlier the better.


      was (Author: pfirmst):
    An Object's Class Type is the fully qualified class name + the originating ClassLoader.

HTTP codebases are part of the problem: the URLClassLoader is fixed in the object's Type
(class identity), which may change over time.  Michael Warres addressed this problem by creating
a dynamic codebase service where the URL was a cryptographic hash checksum of the jar file
(stored data) identity.

Michael gave a presentation on service-based codebases; apparently not much code was required
to implement it.  We cannot directly copy the code (interfaces etc.) from the presentation
due to copyright, although we can produce functionally equivalent code.

http://www.jini.org/files/meetings/eighth/presentations/Warres/Warres.pdf

So the dynamic service based codebase could move around, and be offered redundantly.

In addition, we could update/upgrade/replace the Hierarchical based PreferredClassLoader relationship
with a more flexible DynamicVersioningClassLoader based on ClassWorlds to segregate incompatible
class packages while granting compatible classes the ability to communicate.  There is BSD
licensed code that we could build on:

http://classworlds.codehaus.org/  ClassWorlds has some very elegant simple models (easy to
code & use) that may help us.

From the above website: "The ClassWorlds model does away with the hierarchy normally associated
with ClassLoaders.  Instead, there is a pool of ClassRealms <http://classworlds.codehaus.org/apidocs/com/codehaus/classworlds/ClassRealm.html>
which can import arbitrary packages from other ClassRealms. Effectively, ClassWorlds turns
the old-style hierarchy into a directed graph."

One might give a library its own ClassLoader in each JVM for instance, then we could make
that library available to the applications / services that depended upon it. A later version
of that library would have a separate ClassLoader so that "Jar hell" or "Classpath Hell" (standalone
JVM talk) or its distributed equivalent "Class Type Hell" or "ClassLoader Hell" are avoided
(unfair: ClassLoaders are a blessing in disguise, they add another dimension to namespaces).

The new com.sun.jini.tool.classdepend package has functionality to record class dependency
relationships in an array.  This could be modified to also store each class's unique SHA-1
or MD5 hash checksum, as well as serialVersionUIDs if they implement Serializable or Externalizable;
this dependency array, or a similar HashMap indexed by class name string, could be returned on
request as part of the codebase service.

I purchased a copy of the following paper online (thanks to Jim Waldo for the tip, no endorsements
implied), then found a freely available copy you can all read.  It's called Modular Software Upgrades
for Distributed Systems by Sameer Ajmani, Barbara Liskov and Liuba Shrira.  It discusses updating
services in distributed systems.  It is a coarse-grained versioning system.

http://www.pmg.csail.mit.edu/~ajmani/papers/ecoop06-upgrades.pdf

So I'm currently trying to get my head around a new ClassLoader framework for classes where
a developer knows which objects need to be preserved over time, while their underlying class
implementations may change.  This would be a fine-grained versioning system, at the Class level.
The new package com.sun.jini.tool.classdepend can return a dependency tree stored in an
array; this could be extended to record each class's unique SHA-1 or MD5 hash checksum, as
well as serialVersionUIDs if they implement Serializable or Externalizable.  If a Class
implements VersionedClass, it already stores the fully qualified class name.  The SHA-1 or
MD5 hash checksum would also form part of the security of downloaded code, as this could be checked
before loading the Class file.

Versioning, Identity and preservation of object state / contents over time is much harder
and is unsolved. Jim Waldo recommends Classes implement interfaces, where interfaces are used
for Type.  This allows objects from different ClassLoaders that implement the same interface
to interact and be interchanged.   For instance if you have a Class called Money.class (version
1) and you reimplement its methods and add additional methods in Money.class(version 2), the
only way the objects can be used in the same array etc is if they share a common Interface
or ancestor class.  If you inherit Money.class (version 1) and call it MoneyTree.class you
can override all replaced methods and add additional methods and it can be used as a money
object, however you can't use the new additional methods when in company with the original
class Money's objects, only those that existed prior.

To work towards solving this problem, I'm thinking about a ClassLoader Versioning Framework
for Objects we want to distribute and preserve over time, preserving their state and contents
while retaining the ability to upgrade their class files or bytecodes and also change their
type, using Interfaces (mixins) to enable interoperability between objects with different
types.  Serialization can be used to upgrade an object's class file bytecodes (marshal, unmarshal
into a replacement ClassLoader), with the required visibility granted by what we can build on using
ClassWorlds.  Any objects linking to the VersionedClass Object instance could be strongly
linked indirectly via a reference object that is updated with the VersionedObject's new hard
reference location; all other objects could be strongly linked to the ReferenceObject and
retrieve a weak link from the ReferenceObject to the VersionedObject.  The ClassLoader Versioning
Framework would keep a weak reference to each ReferenceObject with a Lease.  After the Lease
expires, the ClassLoader would check whether that object still exists (it is weakly referenced and may
have been garbage collected) and, if so, check whether its Class file has been updated via an Update
Service, which returns the hash code for the updated Class file.  Alternatively, this could be
requested via a ClassLoader method each time a weak reference is requested through the ReferenceObject.
The VersioningClassLoader would then, with the hash code, request a codebaseURL from the Codebase
Service and create the new version in another ClassLoader.  The new VersionedObject would be
strongly referenced by the existing ReferenceObject, leaving the old VersionedObject with
no strong reference, to be Garbage Collected.

A Mutable object would require a Transaction Manager.

Identity is more difficult.  For objects whose identity is sufficiently determined by the equals()
and hashCode() methods this should be sufficient.  However objects that require a unique identity
can be broken down again into two types:

1. Immutable objects where Object.equals() doesn't determine object Identity.
2. Mutable objects where Object.equals() doesn't determine object Identity.

I think #1 could be handled by an ObjectUniqueID service that provides three unique numbers
(or two separate services? a TimeService and a RandomNumberGeneratorService): the time in
milliseconds at the time of request and two random numbers, one of which could be retrieved locally.
The object would receive this service at instantiation time if required.  The likelihood that
two random numbers and the time in milliseconds would produce a match for the same fully qualified
class name would be very small indeed.  It might be desirable that all VersionedObjects contain
UniqueID fields, versionObjectCreationTimeID, versionRemoteRandomID and versionLocalRandomID.

Well #2 would be more complex, this object would need a Transaction Manager, as well as implement
the requirements of #1.

When checking VersionedObject.equals(), the method must check, prior to any other calculations, that
the object is an instance of each interface implemented.

VersionedObjects could be instantiated by:

public interface VersioningClassLoader {

   // Interface methods are implicitly public and abstract, so no modifier or body is needed.
   ReferenceObject instantiate(Builder builderObject, String fullyQualifiedClassName);

}

The builder Object's Class would have to implement the same interfaces as the VersionedClass.

Or perhaps by implementing static newInstance() methods.  That is, retrieve a weak reference
to the Class object from the VersionedClassLoader and execute an instance method rather than
use constructors; the newInstance() method must request a new reference object from its VersionedClassLoader
and pass a copy of itself (this) to the ReferenceObject, which is then returned to the caller.
The ReferenceObject must use a ReadWriteLock on the reference it keeps to the VersionedObject,
to prevent the method ReferenceObject.getWeakRef() from accessing it during updating.

Public constructors are not permitted as these return strong references; to prevent
the compiler from supplying a default one, classes must declare a private constructor only.

/* This suggestion is bad: the VersionedPublicClass Object reference must not escape or be
published; instead this must be handled by a MobileObjectDelegate wrapper.  See the code and
javadoc for further clarification of the latest implementation.

Calling methods on the VersionedObject could be done by:
ReferenceObject ob = versionedClassLoaderInstance.instantiate(builder, "my.package.classname");
ACommonInterface foo = ob.getWeakRef(); // The ReferenceObject checks the lease is valid first.

or to execute some method:

ob.getWeakRef().someMethod();

We could also ask the object for its Class file lease expiry after retrieving a weak reference
to it; this should be an interface method for VersionedObject.

When all strong links to the ReferenceObject  go out of scope, the ReferenceObject (the only
object that keeps a strong reference to the VersionedObject) and the VersionedObject can be
garbage collected.
*/

Each VersionedClassLoader would be garbage collected after all VersionedObjects it manages
(whose Class Leases have expired) go out of scope (no strong references left).  Objects whose Class
Leases have expired would be migrated to new ClassLoaders (if any classes have been updated
in the package).  Each ClassLoader might have a maximum age determined by the maximum Class
Lease time (configurable, e.g. 42 days max) plus a grace time window during which it can be given
new Class Leases (probably enough time for all VersionedClasses in a dependency tree to be
downloaded, e.g. 10 days).  Once the ClassLoader is eligible for garbage collection, all Class
files loaded by that ClassLoader are also eligible for garbage collection.  VersionedObjects
that haven't been garbage collected at the time of Class Lease expiry would be migrated to
their new ClassLoader as a background process.  Perhaps a private static counter method in
the ReferenceObject might prioritise the loading of objects into the new ClassLoader
by the number of hits a Class might receive.  Other objects that are required but haven't
yet been transferred would jump the queue on a demand basis.

A package upgrade check could be triggered upon receiving a new class file version.  This
may exclude using Leases, instead a weak reference would return null.  Undecided which way
to go yet.

All objects referenced by a VersionedClass Object's instance fields that are declared transient
would be considered supporting objects and would go out of scope once the VersionedClass Object
does.  For example, a new class file upgrade might dictate another version of an external library;
new objects required by the VersionedObject would have to be created during unmarshalling
and the new class files for the library downloaded via the codebase service.  The codebase
service would also advertise the dependency tree it contains, so the VersioningClassLoader
Framework could determine the library's suitability and make it available via a Library ClassLoader
Realm with ClassWorlds if it isn't already available.

All objects upon unmarshalling would be checked to see if their class currently exists in
a ClassLoader and if the Object already resides in memory, based on Lease validity, identity
or equality; if so, the in-memory object would be used.

A Versioned Object would only be permitted to keep references to objects that meet the following
criteria:

  1.  Non versioned and Non Serializable objects that are declared transient and can be reconstructed
after unmarshalling.
  2.  JVM native implementations that implement Serializable or Externalizable.
  4.  A ReferenceObject, or ReferenceObjects for other VersionedObjects.

A ReferenceObject implementation would be provided by the VersionedObject's API and not be
extendible / inheritable.

Once the basic VersionedClass, ReferenceObject and DynamicVersionedClassLoader interfaces and
implementations are agreed upon, a number of useful utility VersionedClass implementations could
be created, for developers to utilise, preferably Interfaces and their implementations for
immutable objects, such as Currency, Quantity, Unit, Money, etc.

What I'm talking about is conceptual and experimental, I'm hoping others will be able to provide
some thoughts / input, assist and see if we can't produce something useful along with the
changes you've made and lessons learned.  Or alternatively tell me I'm totally nuts ;)

Please raise any issues you can see so they can be addressed, the earlier the better.

  
> RFC Library, Application & Class Versioning, Dynamically Mobile Codebases and Classloading
enhancements
> -------------------------------------------------------------------------------------------------------
>
>                 Key: RIVER-316
>                 URL: https://issues.apache.org/jira/browse/RIVER-316
>             Project: River
>          Issue Type: New Feature
>          Components: net_jini_loader
>         Environment: All
>            Reporter: Peter Firmstone
>         Attachments: classworlds-1.0-src.zip, Java Classloader issues relating to Jini
smli_tr-2006-149.pdf, VersionedDynamicClassesRev2.tgz, VersionedDynamicClassesRev3.tgz
>
>
> Request for Comments:
> Proposal to add support for Dynamic Mobile Codebases and Application fine grained class
versioning as well as coarse-grained Library versioning, to enable River user developers
to provide distinction between classes with the same fully qualified class name when code
differences created by refactoring packages or library updates break backward compatibility
between classes contained within that library or package.  ClassWorlds can be used to segregate
ClassRealms for application packages and different library versions.
> A dependency tree array object (contains dependency references between classes, fully
qualified class names are stored as String objects) returned by the new ClassDepend tool (replacement
of classdep functionality) may be suitable (with some modification) for recording class versioning,
for later navigation of the codebase for class version verification, perhaps this could be
stored in serialized form with the codebase.
> The ASM library might be used to modify existing, externally sourced library class file
bytecodes to add a LIBRARYVERSIONID static final field, with an accessor method, for library
code used in codebases, to mark the class files with the library release version.
> serialVersionUID (when it exists), along with the CLASSVERSION static field, might be
used to determine the dependency and backward compatibility of classes in a codebase, this
information could be stored in the dependency tree along with the CLASSVERSION, fully qualified
class name and class file checksum.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

