forrest-dev mailing list archives

From "Gav...." <brightoncomput...@brightontown.com.au>
Subject RE: [RT] A new Forrest implementation?
Date Fri, 18 Aug 2006 10:52:08 GMT
Before I start, I'd like to say that the thoughts in my last comments, about
swinging back towards keeping Cocoon, have not changed altogether. I like what
follows here, but I also hope there is a place to plug in some of Cocoon's
blocks, forms, etc.

> -----Original Message-----
> From: Ross Gardler [mailto:rgardler@apache.org]
> Sent: Friday, 18 August 2006 9:17 AM
> To: dev@forrest.apache.org
> Subject: Re: [RT] A new Forrest implementation?
> 
> Ross Gardler wrote:
> > This is a Random Thought. The ideas contained within are not fully
> > developed and are bound to have lots of holes. The idea is to promote
> > healthy discussion, so please, everyone, dive in and discuss.
> 
> In order to better support my position in this RT I've been
> experimenting with alternative implementations.
> 
> I now have a working (although very hacky) version of a new Forrest
> Core. It is *very* basic right now so don't get too excited, I'm only
> trying to feed the flames of this discussion. The deployed webapp
> version is 960kb, this includes the test code and sample documents. Add
> the size of Xalan and Xerces for the CLI version.
> 
> Clearly this will grow as we add some of the missing features (see
> below). Spring, ehcache and an RE processor are probably the largest
> additional dependencies we need in core and they weigh in at a few
> hundred Kb each (I think).
> 
> In other words, it looks like I can deliver in just a few megabytes.

This sounds really good.

> 
> What it does have:
> ------------------
> 
> - XHTML2 as internal format

Cool, does it work well?

> 
> - Locationmap support
> 
> - plugin architecture
> 
> - XSLT transformations
> 
> - CLI interface (very basic, no link following)
> 
> - Webapp interface
> 
> - File and HTTP readers
> 
> 
> What it doesn't have:
> ---------------------
> 
> - Container managed components
> 
> - pattern matching in the Locationmap or in the output plugin selection
> 
> - handling of aggregated documents - they work on the input side, but
> I'm still considering how best to handle them on the output side.
> 
> - external config files (i.e. the locationmap and available plugins are
> currently hard coded data structures)
> 
> - image (and other binary files) handling
> 
> - caching
> 
> - optimisation (i.e. no SAX stream between the individual components)
> 
> - adequate demos (a couple of Hello World input plugins and Gav's XHTML2
> sample document only)
> 
> - loads of stuff I haven't thought of yet
> 
> 
> Design
> ------
> 
> It's really simple (honest), the processing goes like this:
> 
> request URI (to controller)
>    -> source document(s)  (from readers)
>      -> internal document(s)  (from input plugins)
>        -> output document        (from output plugins)
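
The chain above can be sketched roughly as follows. This is only a minimal illustration, not the prototype's code: the class name PipelineSketch and the process() method are hypothetical, each stage is modelled as a plain function, and Strings stand in for the real document types.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the request -> source -> internal -> output chain.
// None of these names come from the actual prototype.
class PipelineSketch {

    // Each stage is a plain function; Strings stand in for documents.
    static String process(URI requestURI,
                          Function<URI, List<String>> readers,
                          Function<String, String> inputPlugin,
                          Function<List<String>, String> outputPlugin) {
        // 1. Readers fetch the source document(s) for the request.
        List<String> sources = readers.apply(requestURI);

        // 2. An input plugin converts each source to the internal format.
        List<String> internal = new ArrayList<>();
        for (String source : sources) {
            internal.add(inputPlugin.apply(source));
        }

        // 3. An output plugin turns the internal document(s) into one output.
        return outputPlugin.apply(internal);
    }

    public static void main(String[] args) {
        String out = process(
                URI.create("/index.html"),
                uri -> Arrays.asList("hello", "world"),   // two aggregated sources
                src -> "<p>" + src + "</p>",              // toy internal format
                docs -> "<html>" + String.join("", docs) + "</html>");
        System.out.println(out);
    }
}
```

Keeping the stages as separate functions mirrors the idea that readers, input plugins and output plugins can be swapped independently.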
> 
> The main components are:
> 
> Controller
> ----------
> This is the interface point for the application (CLI, webapp, or
> JUnit tests so far). To use it you do something like:
> 
> URI requestURI = new URI(TestController.TEST_REQUEST_URI);
> Controller controller = new Controller();
> AbstractOutputDocument doc = controller.getOutputDocument(requestURI);
> out.println(doc.getContentAsString());
> 
> LocationMap
> -----------
> A simple lookup table mapping the requestURI to the required source
> document(s) - it supports optional files and aggregation.
> 
> A locationmap is built as follows (remember this should be read from a
> config file):
> 
> URI requestURI = new URI(TestController.TEST_REQUEST_URI);
> Location location = new Location(requestURI, this.getClass().getResource(
>         TestController.SOURCE_DOCUMENT_XHTML2_COMPLETE), true);
> locationMap.put(requestURI, location);
> location = new Location(requestURI, this.getClass().getResource(
>         TestController.SOURCE_DOCUMENT_XHTML2_SIMPLE), true);
> locationMap.put(requestURI, location);
> 
> Then you get the location(s) with:
> 
> List<Location> locations = locationMap.get(requestURI);
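
Since put() is called twice for the same requestURI and get() returns a list, the map presumably keeps every location registered for a URI. A minimal sketch of that behaviour, under my own assumptions: LocationMapSketch and its nested Location are hypothetical stand-ins, the source is held as a String rather than a URL, and the boolean is assumed to mean "required".

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a lookup table that keeps every location registered
// for a request URI, so two put() calls yield an aggregation of two sources.
class LocationMapSketch {

    // Stand-in for the prototype's Location class. The meaning of the
    // boolean ("required") is an assumption.
    static final class Location {
        final URI requestURI;
        final String source;
        final boolean required;

        Location(URI requestURI, String source, boolean required) {
            this.requestURI = requestURI;
            this.source = source;
            this.required = required;
        }
    }

    private final Map<URI, List<Location>> locations = new HashMap<>();

    // Append rather than overwrite, preserving registration order.
    void put(URI requestURI, Location location) {
        locations.computeIfAbsent(requestURI, key -> new ArrayList<>()).add(location);
    }

    // Return all locations for the URI, or an empty list if none are known.
    List<Location> get(URI requestURI) {
        List<Location> found = locations.get(requestURI);
        return found == null
                ? Collections.<Location>emptyList()
                : Collections.unmodifiableList(found);
    }

    public static void main(String[] args) {
        LocationMapSketch map = new LocationMapSketch();
        URI request = URI.create("/test.html");
        map.put(request, new Location(request, "complete.xhtml2", true));
        map.put(request, new Location(request, "simple.xhtml2", true));
        for (Location location : map.get(request)) {
            System.out.println(location.source);
        }
    }
}
```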
> 
> 
> ReaderFactory
> -------------
> Given the URL of a source document this factory returns the correct
> reader for the document. For example "http://foo.com" will return an
> HTTPReader whilst "file://bar" will return a file reader.
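
Choosing a reader from the URL could be as simple as switching on the scheme. The sketch below is an illustration only: ReaderFactorySketch, the Reader interface and FileSourceReader are hypothetical names, and treating "https" like "http" is my own addition.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch of picking a reader from the scheme of the source URL.
class ReaderFactorySketch {

    // Placeholder for whatever the real reader contract looks like.
    interface Reader {
        String name();
    }

    static final class HTTPReader implements Reader {
        public String name() { return "http"; }
    }

    static final class FileSourceReader implements Reader {
        public String name() { return "file"; }
    }

    static Reader getReader(URI source) {
        String scheme = source.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No scheme in " + source);
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "http":
            case "https": // assumption: https is handled by the same reader
                return new HTTPReader();
            case "file":
                return new FileSourceReader();
            default:
                throw new IllegalArgumentException("No reader for scheme " + scheme);
        }
    }

    public static void main(String[] args) {
        System.out.println(getReader(URI.create("http://foo.com")).name());
        System.out.println(getReader(URI.create("file://bar")).name());
    }
}
```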
> 
> Reader
> ------
> Reads a source document and infers the type of document it is (XML,
> image etc. although only XML is supported right now). This returns a
> typed document class representing the document by using a DocumentFactory.
> 
> DocumentFactory
> ---------------
> This is perhaps the most complicated part of the system. It is roughly
> equivalent to the source resolver, that is, it infers the type of
> document we are working with. Once it knows the type of document it can
> provide a typed document object.
> 
> If we have a mime-type that gives us enough information it will use that
> (e.g. an OOo document). If not it will try looking ahead into the
> contents of the file until it has enough info. For example:
> 
> // Read ahead until the stream ends or the type is known.
> while ((numRead = reader.read(buf)) != -1 && mimeType == null) {
>    String readData = String.valueOf(buf, 0, numRead);
>    fileData.append(readData);
>    buf = new char[1024];
>    if (fileData.toString().contains("<?xml")) {
>       // An XML declaration was seen, so treat the source as XML.
>       String type = getXMLDocumentType(fileData.toString());
>       doc = new XMLSourceDocument(fileData.toString(), reader);
>    }
> }
> 

This is not totally foolproof. That declaration could be in .html files, and
.php files could possibly contain no PHP whatsoever (a useless waste, but I've
seen it). You would have to determine that the string being matched is a
genuine XML declaration and not a code example snippet. Mind you, you are
restricting the search to the first few lines of the document, so this should
not be an issue. I'm just mentioning it to cater for any mistakes etc.
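
One way to tighten the check, as a minimal sketch: only accept the declaration when it is the very first thing in the content, rather than anywhere in it. The class and method names are hypothetical, and the sketch assumes the bytes have already been decoded to characters. It will not recognise XML files that omit the declaration, so it would need a fallback.

```java
// Hypothetical sketch: accept "<?xml" only at the very start of the content,
// so a declaration quoted inside a code example further down is ignored.
class XmlSniffSketch {

    private static final String XML_DECL = "<?xml";

    static boolean startsWithXmlDeclaration(CharSequence head) {
        int start = 0;
        // Skip a byte order mark if the decoder left one in place.
        if (head.length() > 0 && head.charAt(0) == '\uFEFF') {
            start = 1;
        }
        // There must be room for "<?xml" plus one following character.
        if (head.length() < start + XML_DECL.length() + 1) {
            return false;
        }
        for (int i = 0; i < XML_DECL.length(); i++) {
            if (head.charAt(start + i) != XML_DECL.charAt(i)) {
                return false;
            }
        }
        // Require whitespace next, so "<?xml-stylesheet" does not match.
        return Character.isWhitespace(head.charAt(start + XML_DECL.length()));
    }

    public static void main(String[] args) {
        System.out.println(startsWithXmlDeclaration("<?xml version=\"1.0\"?><root/>"));
        System.out.println(startsWithXmlDeclaration("<p>Example: <?xml version=\"1.0\"?></p>"));
    }
}
```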


> InputPluginFactory
> ------------------
> Given a typed source document this factory provides an InputPlugin to
> convert from the source document to an internal XHTML2 document.
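
The lookup from typed document to plugin could be a simple registry keyed by document class. This is a sketch under my own assumptions, not the prototype's design: every name below is a stand-in (the XMLSourceDocument here is a simplified placeholder with a different constructor), and the internal document is represented as a String.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: input plugins registered against the document type
// they can convert, and looked up by the runtime class of the source.
class InputPluginFactorySketch {

    // Simplified stand-ins for typed source documents.
    static class SourceDocument {
        final String content;

        SourceDocument(String content) {
            this.content = content;
        }
    }

    static final class XMLSourceDocument extends SourceDocument {
        XMLSourceDocument(String content) {
            super(content);
        }
    }

    // Converts a source document to the internal format (a String here).
    interface InputPlugin {
        String toInternal(SourceDocument doc);
    }

    private final Map<Class<? extends SourceDocument>, InputPlugin> plugins = new HashMap<>();

    void register(Class<? extends SourceDocument> type, InputPlugin plugin) {
        plugins.put(type, plugin);
    }

    InputPlugin getPlugin(SourceDocument doc) {
        InputPlugin plugin = plugins.get(doc.getClass());
        if (plugin == null) {
            throw new IllegalArgumentException(
                    "No input plugin for " + doc.getClass().getSimpleName());
        }
        return plugin;
    }

    public static void main(String[] args) {
        InputPluginFactorySketch factory = new InputPluginFactorySketch();
        factory.register(XMLSourceDocument.class, doc -> "<html>" + doc.content + "</html>");
        SourceDocument source = new XMLSourceDocument("<p>Hello</p>");
        System.out.println(factory.getPlugin(source).toInternal(source));
    }
}
```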
> 
> OutputPluginFactory
> -------------------
> Given one or more internal documents and a request URI this factory
> provides an OutputPlugin for producing the final document as requested.
> 
> What Next?
> ==========
> 
> - make the locationmap load from an XML file
> 
> - use a component manager and configure plugins from the container's
> config files (rather than building the config in code)
> 
> - handle aggregation on the output side (I'd like to discuss how to
> handle this if anyone is interested in helping out, but we must be aware
> that this is still an RT and we are only exploring ideas not committing
> the community to anything)
> 
> - create a documentation/test site
> 
> - build a number of input and output plugins for testing, such as:
>     - OOo Input
>     - RDBMS Input
>     - PDF Output
>     - RTF Output
> 
> - experiment with cocoon integration as a plugin (I *really* like this
> idea; it seems to cover most of the drawbacks Tim has identified in my
> ideas, although I'd like to hear what Tim thinks about this)

I do too. Use Cocoon parts for what we need them for, with the other parts
available to anyone else who wants them. If Cocoon is doing something well
that we need, I would not want to see it rewritten just for the sake of
getting rid of as much of Cocoon as possible, but if a better, smaller,
cleaner way is possible then that is good too.

> 
> - decide if we should continue experimenting along this line.

Sounds good to me. I'd like to see what you've got so far, so I can play and
judge better. Can you put it in whiteboard, or would a branch be
needed/preferred? Either way is good for me.

Gav...

> 
> Ross
> 
> 


