river-dev mailing list archives

From Dennis Reedy <dennis.re...@gmail.com>
Subject Re: Maven repository Entry was Re: Codebase service?
Date Tue, 25 May 2010 12:08:54 GMT
Hi Peter,

Thanks for the detailed reply; comments are interspersed below. Also, from a housekeeping point
of view (if not already done), it would be great if Jira issues could be created for the items
below.

Dennis

On May 25, 2010, at 2:05 AM, Peter Firmstone wrote:

> Hi Dennis,
> 
> Reasoning, and hopefully the whys, below.

> 
> Dennis Reedy wrote:
>> Hi Peter,
>> 
>> I was hoping to take a step back for a second, perhaps it's just me that seems to
have my head spinning of late on this list. I may have missed some things, but we've discussed
many issues over the past week:
>> 
>> - How to advertise the DL jar(s) a service vends, allowing a client to download requisite
jars that allow the jars to be loaded from a local (trusted) location
>>  
> Yes, we can use an Entry, or as Chris pointed out, if we annotate MarshalledInstances
using a new Maven URL schema we can extract that info and make it available via MarshalledServiceItem
(an abstract class that extends ServiceItem).

I don't think a new Maven URL schema has actually been proposed. Why wouldn't we just use a
String attribute in an Entry that is of the form groupId:artifactId:version:classifier?
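
For illustration, a minimal Java sketch of how a client might interpret such a String attribute.
The class name ArtifactCoordinates and its methods are hypothetical (they are not part of River
or Maven), the Entry itself is left out, and the repository path assumes the default Maven 2
repository layout with jar packaging:

```java
import java.util.Objects;

// Hypothetical helper for illustration only; not part of River or Maven.
final class ArtifactCoordinates {
    final String groupId;
    final String artifactId;
    final String version;
    final String classifier; // null when the coordinate has no classifier

    private ArtifactCoordinates(String groupId, String artifactId,
                                String version, String classifier) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
        this.classifier = classifier;
    }

    // Parses "groupId:artifactId:version" or "groupId:artifactId:version:classifier".
    static ArtifactCoordinates parse(String spec) {
        Objects.requireNonNull(spec, "spec");
        // The -1 limit keeps empty trailing fields so they can be rejected below.
        String[] parts = spec.split(":", -1);
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException("Expected 3 or 4 fields: " + spec);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty field in: " + spec);
            }
        }
        return new ArtifactCoordinates(parts[0], parts[1], parts[2],
                                       parts.length == 4 ? parts[3] : null);
    }

    // Relative jar path, assuming the default Maven 2 repository layout.
    String repositoryPath() {
        String suffix = classifier == null ? "" : "-" + classifier;
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version
                + "/" + artifactId + "-" + version + suffix + ".jar";
    }
}
```

With a made-up coordinate such as "org.example:service:1.0:dl", repositoryPath() yields
org/example/service/1.0/service-1.0-dl.jar. The client would still need to know which
repositories to resolve that path against.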

> 
>> - Given the capability above, the need for a codebase service may not be required
>>  
> Agreed
>> - Conventions on how to develop River services, as it relates to jar naming, packaging
and what dependencies are between the various artifacts
>> - How to possibly move forward with utilizing Maven repositories and the implied
capabilities of published artifacts
>> - The development of a maven archetype to allow a developer to easily create a working
project in seconds
>>  
> Yes to all above.
>> Your attention to detail and your documentation of class loader interactions with
regard to security are great. I'd like to understand the requirements of what you have documented
below, the urge to refactor MarshalledInstance, and why the new class loader hierarchy needs
to be added to River.
>>  
> 
> The urge to refactor MarshalledInstance is to allow the URL annotation to be requested
directly and passed via StreamServiceRegistrar and combined with delayed unmarshalling of
proxies via MarshalledServiceItem, to allow the client to provision and provide an alternate
CodeSource if need be.

So this is related to the first bullet above, allowing a client to download requisite jars.
I don't see why MarshalledInstance needs refactoring if we already have the jar(s)/artifact
that can be provisioned. In the case of an artifact, it may not matter what the MarshalledInstance
provides, because the artifact's location will most likely be in a repository.

> 
> StreamServiceRegistrar returns a ResultStream<ServiceItem>, so you have to check
with instanceof MarshalledServiceItem.
> 
> The new packaging Scheme

packaging scheme?

> can be applied to distributed objects also, provided we create an implementation of CodebaseAccessClassLoader
(contributed by Gregg to replace RMIClassLoaderSPI) that performs or requests local Maven
archive provisioning.

As I have pointed out earlier, you'll need more information on where to get the artifacts from,
specifically the Maven repositories to access. I don't know if the RMIClassLoaderSpi is the
right place to put this added functionality or not at this time.

> 
> The new ClassLoader hierarchy is needed, to solve class identity (fully qualified runtime
classname = class + ClassLoader), class visibility, isolation and versioning problems, that
PreferredClassProvider partially solves.

>> Perhaps I'm just missing some fundamental issues, but maybe we need to take some
time and determine the whys before the hows? Is this direction fundamental to the OSGi direction
that you're taking? If so, how does this impact non-OSGi based systems?
>>  
> The changes are OSGi agnostic, OSGi will live in the application space, so while they
benefit OSGi, they are independent of it, so the same benefits will apply to other software
and OSGi isn't required.
> 
> I realised that fundamentally OSGi uses ClassLoaders for isolating software into components,
so implementation classes aren't exposed outside of their module, something which OSGi does
very well; it also manages security concerns very well.  Something else I realised: OSGi's
use of ClassLoaders is not optimum for distributed systems, because there are difficulties determining
the correct ClassLoader during deserialization. OSGi wasn't designed with Serialization in
mind.  Distributed computing introduces another dimension, like going from 2D to 3D.  In OSGi,
you only have one bundle version combination loaded (you can have many bundles of different
versions but I believe typically only one of each unique bundle instance, and you can have the
same package version exported by differently versioned bundles). So how do you determine the
correct ClassLoader during unmarshalling?  In River we may have many proxies using the same
jar version; however, we don't want the proxy's implementation to get all tied up in the local
application bundles. We'd be allowing the smart proxy to pollute the local application space,
and some parts of the local application could see the proxy implementation.
> 
> In our new ClassLoader tree, a smart proxy can have its own personal ClassLoader,

As it would today through the class loader the RMIClassLoaderSpi returns, right?

> because the ContextClassLoader will be that of the proxy during deserialization of the returned object,
since the proxy initiated the communication with the remote Service host.  The reason a client's parameter
implementation cannot have its own ClassLoader and must share with other clients that use
the same codebase and version is that they have no link to the ClassLoader at the remote Service
host, with only the Codebase and Version to go by, since they didn't initiate the communication.
There could otherwise be many ClassLoaders containing that codebase version, and there is not enough
information to find the right one. The last thing I want to do is require the client to have an identity
or location to deal with that deserialization of parameters at the Service node.

If the client has the service-api.jar in its classpath, why are there issues surrounding
the client's parameter implementation's class loader?

> 
> Rather than take, "how you use OSGi" and apply it to River, I decided to understand why
they solved their problems the way they did and learn from it.  It is a very good solution
to the problem they've solved.  However with our solution we can solve the deserialization
issue for distributed applications utilising OSGi.
> 
> Currently River uses Permission grants based on ClassLoader (so does OSGi). What I realised
was that I needed a finer grained Permission grant, and having many ProtectionDomains inside one
ClassLoader is about as fine as you can get.  Only one ClassLoader is used for the API space
for class identity reasons, to allow maximum sharing of API classes, because you just can't
control and coordinate someone else's JVM's ClassLoader visibility without overcoming some
serious trust issues (simpler is better; I don't even want to attempt to solve them!). There
is however one compromise with my approach.
> 
> By loading all API classes into the same ClassLoader, we cannot have duplicate classes,
so we must always load the latest API version, that must not break backward compatibility.
If the backward compatibility constraints are hampering your design, it's simply better to
deprecate a package and append a number to change the package name.  (Or create a completely
new API jar)

If I understand correctly, I think this is the crux of the issue. I don't understand why you
need to load all API classes with the same class loader. FWIW, in Rio we handle the loading
(and unloading) of services with the following structure (http://www.rio-project.org/apidocs/org/rioproject/boot/package-summary.html#package_description):

                  AppCL
                    |
            CommonClassLoader (http:// URLs of common JARs)
                    +
                    |
                    +
            +-------+-------+----...---+
            |               |          |
        Service-1CL   Service-2CL  Service-nCL
        

AppCL - Contains the main() class of the container. Main-Class in manifest points to com.sun.jini.start.ServiceStarter
Classpath:  boot.jar, start.jar, jsk-platform.jar
Codebase: none

CommonClassLoader - Contains the common Rio and Jini technology classes (and other declared
common platform JARs) to be made available to its children.
Classpath: Common JARs such as rio.jar
Codebase: Context dependent. The codebase returned is the codebase of the specific child CL
that is the current context of the request.

Service-nCL - Contains the service specific implementation classes.
Classpath: serviceImpl.jar
Codebase: "serviceX-dl.jar rio-dl.jar jsk-lib-dl.jar"

Certainly not as sophisticated as OSGi (or what you are targeting), but it meets the requirements
of allowing multiple service versions, applying security context per class loader using the
same approach as ActivateWrapper, and allowing the JVM to stay running.
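
To make the shape of that tree concrete, here is a minimal sketch using only java.net.URLClassLoader.
It is not Rio's implementation (the real loaders are in org.rioproject.boot and do considerably
more, including the context-dependent codebase handling); the class and method names are
hypothetical:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of the AppCL -> CommonClassLoader -> Service-nCL tree.
final class LoaderTreeSketch {

    // Child of the application loader: holds the common platform jars
    // that every service is allowed to see.
    static URLClassLoader newCommonLoader(ClassLoader appLoader, URL... commonJars) {
        return new URLClassLoader(commonJars, appLoader);
    }

    // One loader per service. Siblings share the common parent, but their
    // implementation classes are not visible to each other, so two versions
    // of the same service can be loaded side by side.
    static URLClassLoader newServiceLoader(ClassLoader commonLoader, URL... implJars) {
        return new URLClassLoader(implJars, commonLoader);
    }
}
```

Each service gets its own call to newServiceLoader with the same common parent. Dropping all
references to a service's loader (and to the classes and objects it defined) makes the service
eligible for unloading while the JVM keeps running.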

> 
> org.some.thing
> org.some.thing2
> 
> The reason we version packages is so we don't have to rename them when they break backward
compatibility. This makes sense for implementations, but not API.  If you're going to have long
lived persistent objects, they belong in the API space. If you don't need to persist your objects,
why not have an interface and throwaway class implementations? This solves Serialization exposing
class internal state and evolution.  Extend the interface if you want new methods.
> 
> If a JVM has been running a long time, a new API version may have been released; clients
using the old API functionality only won't be able to see or utilise the new functionality
until we restart the JVM.  That is the compromise.  But I figure it's not too bad a compromise
once APIs have stabilised and go into longer development cycles.  I can handle having to
restart my JVM once every 6 months.
> 
> I think Michael Warres got to the crux of the problem with his publication on ClassLoader
issues. My interpretation of what he said is that perhaps Java should tear apart the multiple
ClassLoader concerns of Security, Isolation and Identity and start again.  I've chosen what
appears to me to be the best compromise based on Java ClassLoaders today.
> 
> So this new ClassLoader hierarchy should play nice with Maven,

I would suggest it has nothing to do with Maven. Maven is just used as a repository for us,
and optionally as a way to build services.


> OSGi and other stuff too, because now the API is visible to everything below in the ClassLoader
hierarchy, while the implementations below, don't expose themselves, instead, everything cooperates
through the API.
> 
> OSGi can be used to synchronize ClassLoader visibility between two separate JVMs; however,
that still requires the implementer to deal with deserialization issues. With our solution, we
won't have to worry much about ClassLoader issues.  With Maven, we won't have to worry about
lost codebases either.
> 
> Yep, it has been a bit of a head spin, needed your help to work out the details before
I forgot them.
> 
> There is one more detail I'd like to include in the jar archive: a list of permissions
the jar needs.  I'd like to use the same format OSGi uses, because it's been done before,
so why be different?  This is to solve the "what grants does it need?" problem, so we can minimise
permission grants.

Yes, I'm interested in this as well.

