forrest-dev mailing list archives

From Ross Gardler <rgard...@apache.org>
Subject What does "XHTML2 as an internal document format" mean?
Date Wed, 14 Sep 2005 23:56:20 GMT
This mail is background for our upcoming IRC session on converting 
Forrest to use a subset of XHTML2 as its internal document format. There 
appear to be at least two, if not three (or even more) opinions on this. 
The purpose of this thread is not (at least initially) to debate each 
opinion, but instead to provide background information to feed into the 
IRC session.

If you have a suggestion for an approach then please add it to this 
thread. However, please avoid commenting on other proposals that have 
gone before (other than to say "as described by..." in cases of agreement).

The idea is for this to be an initial brainstorming thread, *not* a 
discussion or planning thread. We'll do that later; let's just absorb one 
another's ideas so we can extract the best of them all via IRC 
discussion. We can then come back to this thread and wrap up with our 
conclusions.

--------

Here's an outline of my approach:

--------

Assumptions
===========

First of all I assume that there is no point in working on anything to 
do with the old skinning system. It is going to be removed in favour of 
views and I don't want to have to refactor things twice.

I am using forrest:views to define the various technologies that, 
together, provide the new skinning system. That is, those items defined 
in [1].


Defining the Core Pipeline
==========================

The pipeline when using views is discussed in [1] where we define the 
pipeline to be either:

                                            theme
                                              |
                                             \|/
src -> input plugin -> core (views) -> output plugin -> output
                         |        /|\
                        \|/        |
                     forrest:contracts

As defined in [5] or:


                                            theme
                                              |
                                             \|/
src -> input plugin -> core (views) -> output plugin -> output
                        /|\      |          |
                         |      \|/        \|/
                         +------------------+
                         |forrest:contracts |
                         |forrest:properties|
                         +------------------+


This latter pipeline was suggested because "the contracts as viewHelper 
should come *from* the plugin" [2] (actually I reversed the last arrow 
from the original post because of this description).

[It should be noted that since these mails were written we have agreed 
to rename the part of forrest:views shown here in core as "structurer", 
I will use the term structurer in the rest of this mail]

Both of the above are aligned with our TR document [4] which defines the 
stages along the central pipeline as:

Resolver -> Xifier -> Filter -> Windower -> Themer -> Serializer

Cool, lots of agreement there :-)
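
As a rough illustration only, the stage order from [4] can be read as plain 
function composition, where each stage consumes the previous stage's output. 
This is a Python sketch, not Forrest code: make_stage and run_pipeline are 
hypothetical names, and the stages do no real work beyond recording their 
own name.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[str], str]

# Stage names as listed in the TR document [4].
STAGE_NAMES = ["Resolver", "Xifier", "Filter", "Windower", "Themer", "Serializer"]


def make_stage(name: str) -> Stage:
    """Return a placeholder stage that only appends its name to the document."""
    def stage(doc: str) -> str:
        return f"{doc} -> {name}"
    return stage


def run_pipeline(src: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda doc, stage: stage(doc), stages, src)


trace = run_pipeline("src", [make_stage(n) for n in STAGE_NAMES])
print(trace)
```

The point of the sketch is only that the order is fixed and that each stage 
sees nothing but the output of the stage before it.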

Fitting Forrest:Views into the Pipeline
=======================================

So, we seem to be in agreement on the core pipeline. However, there are 
actually two opinions on how views fit in. I am going to really rock the 
boat and add a third (even though one of the above is mine ;-)

Why do we need a third? Let's start off by looking at the definitions of 
the various parts of this pipeline:

Structurer
----------

The structurer part of a view is defined as adding "a structure to the 
generated page that clearly identifies all the content in the final 
output" [6] and [7], and further as "The structuring of the assembled 
page where all content is in place and structured with forrest:hooks to 
provide hooks for theming." [8]

OK, so it is pretty clear that the *.fv files are part of the 
structurer. And these belong in core, that is the language used is 
defined by Forrest core itself. It is an internal format. Note this 
means we can use, for example, the Cocoon Portal page layout language as 
an input format for the structurer, or we can generate it as an output 
from the structurer.

Note that the structurer does *not* define any content. Therefore core 
should *not* have any knowledge of content.

Forrest Contracts
-----------------

Forrest:contracts are defined as "the templated content that should be 
inserted into the final document. These may create a new request in 
order to generate the content" [5] and as "Helpers (forrest:contracts) 
mainly adapt and transform the presentation model (pm) for the view, but 
also help with any limited business processing that is initiated from 
the view (forrest:properties)" [8]

So contracts describe how to retrieve/extract bits of content (or 
nuggets) to be inserted into the final document at locations defined in 
the *.fv files (for the structurer).
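
To make the split between structurer and contracts concrete, here is a small 
Python sketch under my own assumptions. VIEW stands in for a *.fv file 
(locations only, no content), CONTRACTS maps hook names to functions that 
fetch a nugget, and structure() is a hypothetical helper. None of these names 
exist in Forrest.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a *.fv file: an ordered list of hook names.
# It names locations only and holds no content itself.
VIEW: List[str] = ["header", "nav", "body"]

# Hypothetical contracts: each one knows how to fetch one nugget of content.
Contract = Callable[[dict], str]
CONTRACTS: Dict[str, Contract] = {
    "header": lambda src: src["title"],
    "body": lambda src: src["text"],
}


def structure(view: List[str], contracts: Dict[str, Contract],
              source: dict) -> List[Tuple[str, str]]:
    """Place each contract's nugget at the hook the view names for it."""
    page = []
    for hook in view:
        contract = contracts.get(hook)
        # A hook with no registered contract is left empty rather than guessed.
        nugget = contract(source) if contract else ""
        page.append((hook, nugget))
    return page


page = structure(VIEW, CONTRACTS, {"title": "Welcome", "text": "Hello"})
print(page)
```

Note that structure() never inspects the content it places. All knowledge of 
content lives in the contracts.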

Output Plugins
--------------

An output plugin is defined as providing "a new output format. For 
example, the s5 plugin extends Forrest to produce HTML slides from 
Forrest documents." [3]

So an output plugin provides a version of a document that can be 
rendered, for example, HTML or FO. It may also provide a theme to 
describe how this should be displayed in the final rendering, e.g. CSS 
(FO has no separate theme, but the plugin may provide configuration info 
for the generated FO).

In my view there is nothing in this definition that describes *content* 
and since forrest:contracts are about content they have no place in 
output plugins.

However, they do have a place in input plugins since they *do* 
define content. Some examples can be found in my recent work on the 
Resume plugin where I have defined contracts to insert the various 
portions of a resume into documents.

Finally, they fit!
------------------

So given the definitions/opinions above, I think the processing 
pipeline, with views plugged in is:

                                            theme
                                              |
                                             \|/
src -> input plugin -> core (views) -> output plugin -> output
  |          |              /|\                           /|\
  |          |               |                             |
  |          |         \ +------------------+              |
  |          +---------- |forrest:contracts |              |
  |                    / |forrest:properties|              |
  |                      +------------------+              |
  |                                                        |
  |                                                        |
  +--------------------------------------------------------+

Notice that *all* of our contracts are coming from input plugins. Why is 
this? The answer will become clear in the next section (I hope).

XHTML2 in Core
==============

So finally we come to the point. What does it mean for XHTML2 to be our 
internal document format? First (not quite there yet) let's consider why 
we have an internal format:

We want to convert many source formats into many output formats. We want 
to do this with minimal effort. So we adopt an internal format and write 
a series of output plugins to give us the different formats from that 
single internal format. Now we write a load of input plugins to convert 
the source formats into our internal format and voilà, we have 
many-to-many conversion.

So, everything coming *in* to our core must be our internal format, and 
everything coming *out* must be our internal format. There should be 
*nothing* inside core of any other format.
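
The arithmetic behind this is worth spelling out: with N source formats and M 
output formats we write N + M plugins rather than N * M direct converters. 
The Python sketch below is illustrative only. The registries, the format 
names and the convert() helper are all hypothetical, and the "internal 
format" is just a tagged string.

```python
from typing import Callable, Dict

# Hypothetical input plugins: each converts one source format to the
# internal format (represented here as a tagged string).
INPUT_PLUGINS: Dict[str, Callable[[str], str]] = {
    "ooo": lambda src: f"<internal>{src}</internal>",
    "xdoc": lambda src: f"<internal>{src}</internal>",
    "txt": lambda src: f"<internal>{src}</internal>",
}

# Hypothetical output plugins: each consumes only the internal format.
OUTPUT_PLUGINS: Dict[str, Callable[[str], str]] = {
    "html": lambda doc: f"html({doc})",
    "pdf": lambda doc: f"pdf({doc})",
}


def convert(src: str, source_format: str, output_format: str) -> str:
    """Route every conversion through the single internal format."""
    internal = INPUT_PLUGINS[source_format](src)
    return OUTPUT_PLUGINS[output_format](internal)


plugins_written = len(INPUT_PLUGINS) + len(OUTPUT_PLUGINS)
conversions_available = len(INPUT_PLUGINS) * len(OUTPUT_PLUGINS)
print(plugins_written, conversions_available)
print(convert("hello", "txt", "pdf"))
```

Adding one more input plugin to the sketch makes every existing output 
format available for it without touching the output side.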

An Example Input Plugin
-----------------------

It is the job of our input plugins to provide the internal format. 
Consider an OpenOffice input plugin: it converts the OOo XML format to our 
internal format. What forrest:contracts does it provide?

An OOo document consists of meta-data, content (made up of pages, 
sections, paragraphs) and style information. So logical contracts would 
be various meta-data contracts (authors, statistics, abstract, 
keywords), content (all, page X etc.) and style (produces CSS). This way 
a user can decide which parts of the original document are used.
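
As a sketch of what "deciding which parts are used" might look like, the 
Python below treats each contract as a function over an already-parsed 
document. Everything here is an assumption made for illustration: OOO_DOC is 
placeholder data rather than the real OOo XML format, and the contract_* 
function names are hypothetical.

```python
from typing import Dict, List

# Placeholder for an already-parsed OOo document (not the real OOo format).
OOO_DOC: Dict = {
    "meta": {"authors": ["A. Author"], "keywords": ["forrest", "xhtml2"]},
    "pages": ["First page text", "Second page text"],
}


def contract_authors(doc: Dict) -> str:
    """Meta-data contract: the document's authors."""
    return ", ".join(doc["meta"]["authors"])


def contract_keywords(doc: Dict) -> str:
    """Meta-data contract: the document's keywords."""
    return ", ".join(doc["meta"]["keywords"])


def contract_page(doc: Dict, number: int) -> str:
    """Content contract: a single page, counted from 1."""
    return doc["pages"][number - 1]


def contract_all_content(doc: Dict) -> str:
    """Content contract: every page, in order."""
    return "\n".join(doc["pages"])


# A user picks only the nuggets they want from the original document.
selected: List[str] = [contract_authors(OOO_DOC), contract_page(OOO_DOC, 2)]
print(selected)
```

A style contract (producing CSS) would follow the same shape, but is left out 
because the placeholder data carries no style information.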

An Example Output Plugin
------------------------

It is the job of our output plugins to consume the internal format and 
produce our output format. So they take a *fully structured* document 
and convert it into the chosen output. Let's consider an HTML output 
plugin. What does it provide?

It provides a single XSL that converts XHTML2 to HTML. It may also 
provide an XSL to convert an internal style language into CSS (we 
currently do not have an internal style language, so let's not go there 
just yet, just planting a meme).

What about a PDF output plugin? It provides a single XSL to convert from 
XHTML2 to FO.

Concluding Where XHTML2 Fits
----------------------------

It fits in the forrest:contracts and in the internal processing within 
core (structurer).

How do we Implement it?
=======================

Let's first consider what we have (in the XHTML2 plugin since this is the 
approach I am outlining here):

- we have an XHTML2 based site

- we have the start of the XHTML to HTML stylesheet that will be the 
major part of the HTML output plugin

- we have some templates converted to use XHTML2 - these will form the 
start of an XHTML2 input plugin

- we have a structurer sitemap that is basically the two existing views 
plugins thrown together

Combined, these elements will provide the content elements of a page. 
They do not currently work with navigation etc. because the aggregation 
of navigation has been removed: it belongs in the contracts, not in the 
structurer (as discussed above).


Roadmap
-------

Now what do we need to do?

- enable the navigation contracts

- convert all contracts to XHTML2

- break out the HTML output plugin

- add theming support

- break out the XHTML2 input plugin

- refactor (or rewrite?) the structurer sitemap (with locationmap in mind)

The Future
==========

This last step (refactor structurer sitemaps) is really part of a larger 
effort to address the first stage of our pipeline as defined above. 
That is, the resolving of the source file.

I'll leave that for a whole new Forrest Tuesday.

References
==========

[1] http://marc.theaimsgroup.com/?t=112276643700001&r=1&w=2

[2] http://marc.theaimsgroup.com/?l=forrest-dev&m=112596689428172&w=2

[3] http://forrest.apache.org/pluginDocs/plugins_0_80/pluginInfrastructure.html#outputPlugins

[4] http://svn.apache.org/viewcvs.cgi/*checkout*/forrest/trunk/site-author/content/xdocs/TR/2005/WD-forrest10.html

[5] http://marc.theaimsgroup.com/?l=forrest-dev&m=112276632331269&w=2

[6] http://marc.theaimsgroup.com/?l=forrest-dev&m=112277657832032&w=2

[7] http://marc.theaimsgroup.com/?l=forrest-dev&m=112438965225785&w=2

[8] http://marc.theaimsgroup.com/?l=forrest-dev&m=112596689428172&w=2
