cocoon-dev mailing list archives

From Leo Simons <leosim...@apache.org>
Subject [RT][merlin][cocoon] blocks, auto-assembly, versioning
Date Wed, 16 Apr 2003 21:48:58 GMT
Preface
=======

[RT] are Random Thoughts.  This is a tradition in the Cocoon community.
RTs are basically long and thought-provoking mails with new project 
propositions, that are discussed and scrutinized at length.  One 
distinguishing characteristic about RTs is the complete and utter lack 
of consistency with respect to quality: some are pure crap, others are 
pure genius.  Even the original author of a RT is not sure which 
category any given posting falls into at the time it is issued.  This 
posting is no exception.

(preface by Sam Ruby :)

Context
=======
I am looking at the experimental merlin container 
(avalon.apache.org/sandbox/merlin) in development @ avalon, in 
particular comparing it to phoenix and fortress (two other containers @ 
avalon). I am looking at the decomposition/assembly semantics at the moment.

from the merlin docs:

                     Blocks enable
                     the separation of components as an implementation
                     solution from services that are established by those
                     components.

That means zip to me; however, I have previously understood from Steve 
that the merlin notion of a block is similar to the cocoon notion of a 
block. Links to talks about cocoon blocks are at 
http://wiki.cocoondev.org/Wiki.jsp?page=Blocks. I've just read Stefano's 
RT on that (on that wiki as well), and what strikes me is the *huge* 
resemblance to phoenix .sar and .bar files.

This kinda provoked thoughts, since I think we really need less 
semantics rather than more.

Please send replies to dev@avalon.

Random Comments on Stefano's RT
===============================
Cocoon specific item: a sitemap. I don't know exactly what the sitemap 
does within cocoon, but IIUC it basically ties avalon-style components 
to various parts and tasks within a web application. It is sort of an 
xml description of the process-based MVC architecture cocoon implies. 
The whole concept of a sitemap is pretty much specific to a certain 
style of user interface application, and I believe in the current cocoon 
case pretty specific to web-based user interfaces. Not going to talk 
about it further.

An interesting comment from Stefano is that the servlet spec implies 
monolithic applications by design, since it encourages separation of 
wars. This is quite interesting because phoenix separates its sar files 
(Server Application aRchive) more rigorously than the average servlet 
container separates its wars, yet the applications built on top of it 
(that I know) are rarely monolithic.

Rather, they are split into logical units (components, identified by a 
provided work interface and a passive nature), tied together using some 
kind of mechanism which is outside of the scope of phoenix. Common 
setups include making use of a JMX or JNDI registry or having the apps 
talk via (Alt)RMI.

Maybe the average phoenix user understands more about smart software 
architecture than the average servlet coder, or maybe it is basically an 
awareness issue where users follow existing practices. I doubt it would 
be very difficult to write a JNDI servlet which would serve as the 
central registry for a multitude of (cocoon) servlets. It is just not 
how webapps are built, and not what existing patterns and tools focus on.

Instance sharing above and beyond library sharing
-------------------------------------------------
Stefano also points out you almost always have to install multiple 
copies of jars: there is no way to install a single jar and use it 
multiple times (java has no equivalent of .Net shared assemblies, so 
to speak). This is not strictly always true (most servlet engines 
provide a /common/lib), but it is the most common case for webapps to 
provide everything.

What he doesn't explicitly state is that besides sharing of common jars, 
you want to share common instances, or at least get your instances from 
a common pool. This is basically what he dubs "component-oriented 
deployment", IIUC:

  ---------------------------------------------------
  |    Inside the running application appserver     |
  ---------------------------------------------------
  | ----------------------                          |
  | | Processor Pool     |                          |
  | |                    |<-----------------\       |
  | |                    |                  |       |
  | ----------------------                  |       |
  |     ^           get [ProcessorInstance] |       |
  |     | get [ProcessorInstance]           |       |
  |     |                          ---------------  |
  | ---------------                | Application |  |
  | | Application |                ---------------  |
  | ---------------                                 |
  ---------------------------------------------------

in other words, there are various resources multiple applications might 
share, just like in daily life you often have multiple web applications 
talking to the same instance of a (say, mysql) database. In the above 
diagram (don't you love ascii ;), the Processor Pool might be replaced 
by the database (like hsqldb), and the ProcessorInstance might be 
replaced by a DataSource instance. Depends on the granularity.

This kind of setup is not implemented in phoenix (it is left up to the 
user to setup JNDI or something like that, and perform the get() against 
something fetched from JNDI).
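
To make the diagram concrete, here is a minimal sketch of two applications getting the same instance from one shared place. It is only an illustration: SharedRegistry, ProcessorPool and Application are hypothetical names, and a plain in-memory map stands in for whatever JNDI, JMX or AltRMI registry a real setup would use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a JNDI-style registry shared by all applications. */
final class SharedRegistry {
    private final Map<String, Object> instances = new ConcurrentHashMap<>();

    /** Returns the shared instance for a name, creating it on first use. */
    Object lookup(String name, Supplier<?> factory) {
        return instances.computeIfAbsent(name, key -> factory.get());
    }
}

/** Hypothetical shared resource; think "Processor Pool" or a DataSource. */
final class ProcessorPool {
    String process(String input) {
        return "processed:" + input;
    }
}

/** Two of these play the role of the "Application" boxes in the diagram. */
final class Application {
    private final ProcessorPool pool;

    Application(SharedRegistry registry) {
        // Every application asks the same registry under the same name.
        this.pool = (ProcessorPool) registry.lookup("processor-pool", ProcessorPool::new);
    }

    ProcessorPool pool() {
        return pool;
    }
}
```

Two Application objects built against the same SharedRegistry end up holding the very same ProcessorPool object, which is the whole point of sharing instances rather than just jars.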

A container which does enable this (and very elegantly, using altrmi, 
which has proven very useful), is EOB, 
http://www.enterpriseobjectbroker.org/. They (Paul Hammant being a prime 
mover) summarise it as "A bean server that uses .Net style remoting for 
Java". EOB runs on top of phoenix, btw.

IMO, this feature should be implemented using several different 
mechanisms in all avalon-style containers. It rocks.

Separation of interface/impl
----------------------------
Very, very important. GoF already knew that :D

The second major bullet point Stefano has is "polymorphic behavior", 
which in avalon-speak we call "separation of interface and 
implementation", where you rigorously make sure you couple components 
only via work interfaces, and via nothing else. A simple example is that

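----
import java.util.ArrayList;

import org.apache.avalon.framework.service.ServiceException;
import org.apache.avalon.framework.service.ServiceManager;
import org.apache.avalon.framework.service.Serviceable;

class DefaultMyService
         implements Serviceable
{
     ArrayList m_someList;

     public void service( ServiceManager sm )
         throws ServiceException
     {
         // couples this component to one concrete class
         m_someList = (ArrayList)sm.lookup(
             "java.util.ArrayList" );
     }
}
----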
can be improved to
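----
import java.util.List;

import org.apache.avalon.framework.service.ServiceException;
import org.apache.avalon.framework.service.ServiceManager;
import org.apache.avalon.framework.service.Serviceable;

class DefaultMyService
         implements Serviceable
{
     List m_someList;

     public void service( ServiceManager sm )
         throws ServiceException
     {
         // depends on the work interface only
         m_someList = (List)sm.lookup(
             "java.util.List" );
     }
}
----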

this is a contrived example (a List is not a good example of a 
component), but the point is simple: by programming to the List 
interface, you make it easy to plug in (for example) the FastArrayList 
from commons-collections. And you make it easy to change to the use of a 
LinkedList if it turns out that's better for performance. Etc etc. This 
is a very general principle and it has nothing to do with a particular 
COP architecture. It is why interfaces exist in java :D
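
For anyone who wants to try the swap without an avalon container at hand, here is a small runnable sketch. SimpleServiceManager is a hypothetical, much simplified stand-in and not the avalon ServiceManager; ListClient is likewise made up for the illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, much simplified stand-in for an avalon ServiceManager. */
final class SimpleServiceManager {
    private final Map<String, Object> services = new HashMap<>();

    void bind(String role, Object implementation) {
        services.put(role, implementation);
    }

    Object lookup(String role) {
        Object service = services.get(role);
        if (service == null) {
            throw new IllegalStateException("no service bound for role " + role);
        }
        return service;
    }
}

/** Depends on the List work interface only, never on a concrete class. */
final class ListClient {
    private List<String> someList;

    @SuppressWarnings("unchecked")
    void service(SimpleServiceManager sm) {
        someList = (List<String>) sm.lookup("java.util.List");
    }

    int addAndCount(String item) {
        someList.add(item);
        return someList.size();
    }
}
```

Binding an ArrayList or a LinkedList under the role "java.util.List" requires no change to ListClient, which is the property the example above is after.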

Inheritance already available in java
-------------------------------------
Stefano's bullet point number three is inheritance, where a block 
identified by a URL extends another block identified by some other URL. 
This is both potentially complicated to implement, and already 
implemented perfectly well with standard java inheritance. Inheritance 
has its use in COP, and I really don't understand why a different 
mechanism is necessary. In code,

interface MyCustomService extends MyService {}
class DefaultMyCustomService implements MyCustomService {}

SARS/COBS and web locations
---------------------------
Stefano envisions specifying a URI for a cocoon block (.cob) which 
identifies where it can be located on the web. This seems similar to me 
to the <object/> tags in html and the references to the ActiveX controls 
those contain. This can be tremendously useful. In the phoenix world, 
you would add a new .sar via JMX, and then have phoenix figure out what 
the .sar has in terms of external dependencies. If it can't provide 
implementations for all of those, it can download the suggested 
implementation from the web. Again an idea already implemented in .net 
:D. Also reminds me of maven.

Taking this idea somewhat further, one could imagine some kind of 
customizable policy where the appserver might ignore the suggestion for 
an implementation, and instead talk with a central registry to figure 
out what component to download. Your "local repository", or perhaps a 
company-wide repository of COM or corba objects.

The easiest way to implement this is to make the avalon ROLE into a URN 
(URNs and URLs both being kinds of URI), ie attaching 
additional semantic meaning to what currently can be any unique string.
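
A rough sketch of what treating the role as a URI could look like, using java.net.URI from the JDK. The role formats shown (for instance a "urn:avalon:role:..." prefix) and the RoleNames class are assumptions made up for the illustration; nothing in avalon defines them.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch: classify a role string by its URI scheme. Role formats are assumptions. */
final class RoleNames {
    enum Kind { PLAIN, URN, URL }

    static Kind classify(String role) {
        URI uri;
        try {
            uri = new URI(role);
        } catch (URISyntaxException e) {
            return Kind.PLAIN; // not a URI at all: keep treating it as a unique string
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            return Kind.PLAIN; // e.g. "com.my.MyPool", today's convention
        }
        if (scheme.equalsIgnoreCase("urn")) {
            return Kind.URN; // a name only: some registry has to say where it lives
        }
        // Simplification: any other scheme is treated as a locator.
        return Kind.URL;
    }
}
```

A container policy could then decide per kind: download directly for a URL, consult a local or company-wide registry for a URN, and fall back to current behaviour for a plain string.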

Versioning
----------
Stefano talks about what is basically versioning of implementation. Of 
course, you also want versioning of work interface. The level at which 
to implement versioning is the subject of debate. Java has the extension 
mechanism for doing it at the jar level (the recent releases of the 
excalibur components do this neatly for example, and merlin and phoenix 
make use of that. I think netbeans uses this as well). There's also the 
option to brute-force require a certain format for the name of the 
provided jar (ie ${id}-${version}.jar). OSGi does bundle versioning in a 
way similar to the extension mechanism IIRC.

Another option which has been tried (and discarded by some, ie Peter 
Donald :D) is the association of a version with a work interface rather 
than with the archive containing the work interface. .Net does 
versioning of the archive (the assembly), and has a pretty extensive and 
usable mechanism for it (basics at 
http://msdn.microsoft.com/library/en-us/cpguide/html/cpconassemblyversioning.asp). 
Most other stuff I know does this, too.
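
Whatever level the version is attached to, something has to pull it out and compare it. A minimal sketch, assuming plain dotted decimal versions and the ${id}-${version}.jar naming convention mentioned above; the Versions class is hypothetical and ignores qualifiers such as "-beta".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Versions {
    /** Matches the ${id}-${version}.jar convention, e.g. my-pool-3.1.jar. */
    private static final Pattern JAR_NAME =
            Pattern.compile("(.+)-(\\d+(?:\\.\\d+)*)\\.jar");

    /** Returns {id, version}, or null if the name does not follow the convention. */
    static String[] parseJarName(String fileName) {
        Matcher m = JAR_NAME.matcher(fileName);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    /** Compares dotted decimal versions numerically; missing parts count as 0. */
    static int compare(String left, String right) {
        String[] a = left.split("\\.");
        String[] b = right.split("\\.");
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }
}
```

The numeric comparison matters: compared as strings, "1.10" would sort before "1.3.22", while here it correctly counts as the newer version.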


Possible Implementation (my own thoughts)
=========================================
I'll explore a possible implementation setup here (not optimal, 
probably, but possible). I'm going to use the name "merlin" because 
that's where we are most likely to want to experiment with this stuff.

Implementing Versioning
-----------------------
My opinion is that it normally makes sense to provide a version for a 
small group of components, not individual work interfaces, and that it 
makes sense to package such a component into a separately distributed 
archive (jar). I also think it makes sense to use the java extension 
mechanism to do this.

My idea is that what we might want to do is provide a tool to parse all 
classes for something like

package com.my;

/**
  * @merlin.extension
  *     name="MyService"
  *     vendor="Apache Software Foundation"
  *     version="1.3.22"
  *     vendor-id="ASF"
  * @merlin.component
  *     vendor="Apache Software Foundation"
  *     version="1.1"
  *     vendor-id="ASF"
  */
class DefaultMyService implements MyService {}

and transform that into

Manifest-Version: 1.0
Created-By: Avalon-Merlin metadata parsing tool v1.0

Name: com/my/
Extension-Name: MyService
Specification-Vendor: Apache Software Foundation
Specification-Version: 1.3.22
Implementation-Vendor: Apache Software Foundation
Implementation-Version: 1.1
Implementation-Vendor-Id: ASF

to be added to the MANIFEST.MF. I don't know how much of that is already 
in place in the merlin meta tool, but I expect some of it at least. 
Should be pretty straightforward.
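
The manifest-writing half of such a tool can lean on java.util.jar from the JDK. A minimal sketch, assuming the tag values have already been extracted into a map by some javadoc tag parser; ManifestWriter is a hypothetical name and not part of the merlin meta tool.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ManifestWriter {
    /** Builds a manifest with one per-entry section holding the given headers. */
    static Manifest build(String sectionName, Map<String, String> headers) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // The JAR specification requires Manifest-Version in the main section.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Created-By", "hypothetical metadata parsing tool");

        Attributes section = new Attributes();
        headers.forEach(section::putValue);
        manifest.getEntries().put(sectionName, section);
        return manifest;
    }

    /** Renders the manifest as text, the way it would appear in MANIFEST.MF. */
    static String render(Manifest manifest) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling build("com/my/", headers) with Extension-Name, Specification-Version and friends in the map gives a section along the lines of the one shown above.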

Implementing dependency resolution and autodownload
---------------------------------------------------
IMNSHO, the Class-Path mechanism used in the java extension mechanism is 
plain silly in its limited applicability as it works only with relative 
URLs. Since we are moving towards providing @avalon.dependency anyway 
with components, there should be plenty of info there which should make 
it possible to combine a simple resolve.properties with available 
metadata so that autodownload can be facilitated. IOW, when you already have

/**
  * @avalon.dependency
  *     type="MyPool"
  */
public void service( ServiceManager sm );

combining that with

# ~/.merlin/resolve.properties
company.repository=http://maven.my.com
com.my.MyPool=${company.repository}/mypool/jars/my-pool-3.1.jar

is the minimum that would allow autodownload & install. Of course, 
instead of resolve.properties one could use an xml file, the manifest 
file, or yet more attributes parsed into one of those. Something like

/**
  * @avalon.dependency
  *     type="MyPool"
  * @merlin.dependency-info
  *     type="MyPool"
  *     version="3.1"
  *     default-jar-location =
  *       "http://maven.my.com/mypool/jars/my-pool-3.1.jar"
  *     optional="true"
  * @avalon.dependency
  *     type="MyCLITool"
  * @merlin.dependency-info
  *     type="MyCLITool"
  *     version="1.0"
  *     default-jar-location =
  *       "http://maven.my.com/mycli/jars/my-clitool-1.0.jar"
  */
public void service( ServiceManager sm ); /* ... */

might be parsed into

Merlin-Dependency-Name: com/my/MyPool
Merlin-Dependency-Version: 3.1
Merlin-Dependency-Implementation-Location:
    http://maven.my.com/mypool/jars/my-pool-3.1.jar
Merlin-Dependency-Optional: true
Merlin-Dependency-Name: com/my/cli/MyCLITool
Merlin-Dependency-Version: 1.0
Merlin-Dependency-Implementation-Location:
    http://maven.my.com/mycli/jars/my-clitool-1.0.jar
Merlin-Dependency-Optional: false

which could be parsed by the container at runtime. On drag-and-drop of 
my-service-1.0.sar into the merlin apps/ dir, the assembly package might 
scan the manifest file for Merlin-Dependency-Name, and try and find an 
implementation package for com/my/MyPool. If not found, it could 
autodownload the specified jar, verify the dependencies of that package 
are satisfied, until all deps are satisfied.

When all jars are available and on the classpath, the 
avalon-framework-specific part of merlin could kick in and determine 
what services to instantiate (ie do all the stuff it already does).

Now, you might object that the versioning metadata is applied to a single 
component and not a set of components, where I said versioning at the 
jar level is needed. Which is true. However, doing things this way I 
think will actually reduce duplication (we already need the dependency 
and service declaration on a per-component basis!). It also reduces the 
number of files which need to be maintained. For multiple components in 
the same jar, one could simply do an

package com.my;

/**
  * @merlin.dependency-reference type="DefaultMyService"
  */
class SomeOtherDefaultServiceinSameJarAsDefaultMyService
     implements ObjectPool {}

or even just

/**
  * You should take a look at {@link DefaultMyService#service} for
  * the merlin- and avalon-related dependency metadata which should
  * be applied to this service as well when doing auto-resolution of
  * <a href="http://avalon.apache.org/sandbox/merlin/assembly">jar-level
  * dependencies</a>.
  */
class SomeOtherDefaultServiceinSameJarAsDefaultMyService
     implements ObjectPool {}

which would allow moving to the right piece of info on required 
extensions for this class by a single click in IntelliJ IDEA.

Nevertheless, the metadata parser could implement a best-guess but fail 
early mechanism where components in a single jar pointing to other 
implementation jars containing conflicting versions of the same 
component results in an error. As a first step that is; doing component 
isolation and fancy classloading like that available in .Net is of 
course possible later.

I don't see technical impossibilities at all.

More on per-component vs per-jar dependency resolution
------------------------------------------------------
Once again, my current thinking is that you specify which work interface 
implementations a component requires (to be provided in a 
ServiceManager) using the AMTAGS @avalon.dependency tag, then add 
information on versioning and an associated java extension mechanism jar 
using @merlin.dependency-info to enable autodownload. This provides a 
coupling between component dependencies and jar dependencies. The idea 
is that the jar dependencies of all components that go into a single jar 
are merged together during build time, and that conflicts during the 
merge result in an error.
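
The build-time merge itself is small. A minimal sketch, assuming each component's requirements have already been reduced to a map from dependency name to required version; DependencyMerger is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DependencyMerger {
    /**
     * Merges per-component requirements (dependency name -> required version)
     * into one per-jar map, failing early when two components disagree.
     */
    static Map<String, String> merge(List<Map<String, String>> perComponent) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> requirements : perComponent) {
            for (Map.Entry<String, String> entry : requirements.entrySet()) {
                String previous = merged.putIfAbsent(entry.getKey(), entry.getValue());
                if (previous != null && !previous.equals(entry.getValue())) {
                    throw new IllegalStateException("conflicting versions for "
                            + entry.getKey() + ": " + previous
                            + " vs " + entry.getValue());
                }
            }
        }
        return merged;
    }
}
```

Two components that both ask for MyPool 3.1 merge silently; one asking for 3.1 and another for 2.0 breaks the build, which is the fail-early behaviour described above. Treating only identical version strings as compatible is the strictest possible rule and is a simplification.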

This gets you all the benefits of using meta tags (easy enough to 
understand, works well with existing editors, tools available for 
parsing them, reduce the number of source files that need to be 
maintained, etc), while not dragging us down into needing to do real 
complex (potentially impossible) classloading or into the (very 
impractical) need to provide a single jar per single component 
implementation. It also means no dependency info duplication.

Compatible with cocoon blocks needs?
------------------------------------
Instead of opting for a central repository, I'm opting for the easier to 
implement darwinism where the option is left open but the first version 
of an implementation doesn't need to figure out which files it needs to 
download, because the developer specifies @merlin.dependency-info 
default-jar-location = "http://maven.my.com/mycli/jars/my-clitool-1.0.jar"
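
Leaving the option open could be as simple as letting a local override win over the developer's suggestion. A minimal sketch, assuming the resolve.properties format from the dependency resolution section, including its ${...} references; LocationPolicy is a hypothetical name and only one level of substitution is handled.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LocationPolicy {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Expands ${key} references against the same properties (one level only). */
    static String expand(String value, Properties properties) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown keys are left untouched rather than replaced by nothing.
            String replacement = properties.getProperty(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** A local override wins; otherwise the developer's suggestion is used. */
    static String locate(String type, String suggestedLocation, Properties overrides) {
        String override = overrides.getProperty(type);
        return override != null ? expand(override, overrides) : suggestedLocation;
    }
}
```

A central repository could later be slotted in as one more source to consult before falling back to the suggested location, without touching the components themselves.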

Other than that, I think the use case is addressed. Additional comments 
on what stuff doesn't address a use case below ;)

Instance sharing
----------------
AltRMI rocks for that. I need to think more on generalizing the EOB 
semantics for implementation inside merlin. This is a nearly totally 
separate concern, to be implemented (in a container) after dependencies 
have already been downloaded, verified, and classloaded.

Low on semantics, low on design! XP!
------------------------------------
The attribute-driven tag-based approach is tried-and-tested and many 
developers know how to work with it. Some tools (xdoclet, qdox, 
MetaGenerateTask) are already available for generating various kinds of 
files from those tags. By reusing the jar MANIFEST.MF file and the 
extension mechanism (extending on it a little to specify dependencies at 
runtime), there is no need for a custom archive format like .sar or 
.cob. In fact, the entire concept of a "block" as distinct from "some 
kind of aggregation of some components in a jar with a manifest file" is 
simply not (formally or strictly) needed.

I also removed the concept of a "behaviour URI" from cocoon blocks, as 
behaviour is already specified by a work interface, and a work interface 
is already identified by a role.

Finally, I removed the concept of "block inheritance". It might make 
sense in the cocoon context, but in general I can't see what it does 
that java inheritance doesn't.

This thing still does support "optional COP", and is fully 
backwards-compatible with any and all software I can think of. A 
container which doesn't support auto-assembly simply ignores the extra 
entries in the manifest, a metadata parser tool simply ignores the 
@merlin.<blah> stuff. The idea is also to reuse all existing 
infrastructure for this stuff.

Also note the attribute setup is completely optional. What it all boils 
down to is that having a few extra lines like

Manifest-Version: 1.0
Created-By: Avalon-Merlin metadata parsing tool v1.0

Name: com/my/service
Extension-Name: MyService
Specification-Vendor: Apache Software Foundation
Specification-Version: 1.3.22
Implementation-Vendor: Apache Software Foundation
Implementation-Version: 1.1
Implementation-Vendor-Id: ASF

Merlin-Dependency-Name: com/my/pool/MyPool
Merlin-Dependency-Version: 3.1
Merlin-Dependency-Implementation-Location:
    http://maven.my.com/mypool/jars/my-pool-3.1.jar
Merlin-Dependency-Optional: true

Merlin-Dependency-Name: com/my/cli/MyCLITool
Merlin-Dependency-Version: 1.0
Merlin-Dependency-Implementation-Location:
    http://www.my.com/dist/my/cli/my-clitool-1.0.jar
Merlin-Dependency-Optional: false

in my-service-1.3.22.jar!/META-INF/MANIFEST.MF (something which is 
doable by hand, or probably relatively easily generated from a slightly 
modified maven POM using a few lines of jelly) allows automatic 
resolution of dependencies, and addresses half of the cocoon blocks 
requirements without needing additional semantics. The other half can be 
addressed using the EOB approach.
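
On the container side, reading those entries back takes little code. A minimal sketch that parses the text by hand: the JAR specification defines per-entry sections as starting with a Name header, so this sketch does not assume the standard parser accepts the Merlin-Dependency-* blocks as laid out above. MerlinDependencyReader is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MerlinDependencyReader {
    /**
     * Splits manifest-like text into blank-line separated blocks and returns
     * the blocks that carry a Merlin-Dependency-Name header. Lines starting
     * with a space continue the previous header's value.
     */
    static List<Map<String, String>> read(String text) {
        List<Map<String, String>> dependencies = new ArrayList<>();
        Map<String, String> block = new LinkedHashMap<>();
        String lastKey = null;
        // The extra newlines make sure the last block is flushed as well.
        for (String line : (text + "\n\n").split("\r?\n", -1)) {
            if (line.trim().isEmpty()) {
                if (block.containsKey("Merlin-Dependency-Name")) {
                    dependencies.add(block);
                }
                block = new LinkedHashMap<>();
                lastKey = null;
            } else if (line.startsWith(" ") && lastKey != null) {
                block.merge(lastKey, line.trim(), String::concat);
            } else {
                int colon = line.indexOf(':');
                if (colon < 0) {
                    continue; // not a header line; ignore it
                }
                lastKey = line.substring(0, colon).trim();
                block.put(lastKey, line.substring(colon + 1).trim());
            }
        }
        return dependencies;
    }
}
```

Blocks without a Merlin-Dependency-Name header (the main section, the extension section) are simply skipped, mirroring how a container without auto-assembly would skip the Merlin-specific ones.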

Low on ideas from the rest of the world
---------------------------------------
Haven't taken a detailed look at JBoss, OSGi, EJBs, netbeans, eclipse, 
any of them. For the most part constrained my view to the existing 
avalon world. And I haven't read up or followed all of the prior cocoon 
blocks discussions at all. I think most of the problem has already been 
solved in various places. Ignorance is fun; one has the illusion of 
having an original random thought ;)

g'night,

- LSD


