forrest-dev mailing list archives

From Paul Smith <>
Subject Re: [Proposal] add DTDs to Apache website
Date Wed, 14 Jan 2004 12:09:15 GMT
 --- Juan Jose Pablos <> wrote:
> Paul,
> I think that the idea is this one:
> Source                    Intermediate   Output
> document|faq|howto -----> xhtml -----> html|pdf|wml|xhtml 1.0
> wiki
> xhtml ------------------> xhtml -----> html|pdf|wml|xhtml 1.0
> Please correct me if I am wrong...

That's exactly how I read it, so to ensure that valid
document|faq|howto documents are put in at the source end, we still
need document|faq|howto DTDs, no?  What I'm saying is, if the source
format happens to be XML, we need the DTD for it, to ensure that a
valid document is presented to be transformed into xhtml.

If a valid document is at the source end, then the intermediate xhtml
should certainly be valid (unless the transform is broken), so you'd
only need to validate at the intermediate stage as a sanity check, i.e.
something only developers should really have to do.  Then, once we have
valid xhtml in the intermediate stage, it can be transformed into

The point of having an intermediate format, if I'm correct, is this: if
there are X supported source formats and Y supported output formats,
then without one you would need (X * Y) transforms to cover every
possibility of transforming from each of the X sources to each of the
Y outputs.

By having an intermediate format, you need only (X + Y) transforms, X
transforms into xhtml, and Y transforms back out again.
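
To make the counting concrete, here is a rough sketch in Python. It only
counts (from, to) pairs for the formats named in this thread; the variable
names are mine and nothing here is Forrest code.

```python
from itertools import product

# Format names taken from this thread; the variable names are illustrative.
sources = ["document", "faq", "howto", "wiki", "xhtml"]
outputs = ["html", "pdf", "wml", "xhtml+css"]
intermediate = "xhtml"

# Without an intermediate format: one transform per (source, output) pair.
direct = list(product(sources, outputs))

# With an intermediate format: every source goes in, every output comes out.
via_intermediate = (
    [(s, intermediate) for s in sources]
    + [(intermediate, o) for o in outputs]
)

print(len(direct))            # X * Y = 5 * 4 = 20
print(len(via_intermediate))  # X + Y = 5 + 4 = 9
```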

In the example above, with an intermediate format you need the
following transformations

document -> xhtml
faq -> xhtml
howto -> xhtml
wiki -> xhtml
xhtml -> xhtml (Not sure about this one :) )
xhtml -> html
xhtml -> pdf
xhtml -> wml
xhtml -> xhtml+css

For a total of 9 transforms.  However, without the intermediate format,
you would instead need:

document -> html
document -> pdf
document -> wml
document -> xhtml+css
faq -> html
faq -> pdf
faq -> wml
faq -> xhtml+css
howto -> html
howto -> pdf
howto -> wml
howto -> xhtml+css
wiki -> html
wiki -> pdf
wiki -> wml
wiki -> xhtml+css
xhtml -> html
xhtml -> pdf
xhtml -> wml
xhtml -> xhtml+css

For a total of 20 transforms.  Also, by having an intermediate format,
people can write output transforms that the input people might not have
thought about.  Take for example the work needed to add a new input
document doctype, say diary.

With no intermediate format, you have to write

diary -> html
diary -> pdf

Then later, somebody realises that VoiceXML should be available as an
output, so somebody then needs to write

diary -> VoiceXML

Note that the output transform people might not know about the diary
doctype, and the person writing the diary doctype might know nothing
about XSLT!

However, with an intermediate format, you only need to write

diary -> xhtml

then when VoiceXML gets added to the output list, it will automatically
be applicable to the diary doctype, with no communication between
inputters and outputters necessary.
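
The decoupling can be sketched like this (Python again). Everything here is
hypothetical: the two registries, the convert function, and the stand-in
string transforms, which are not real XSLT and do not emit real markup for
any voice format. The point is only that the output step never sees the
source type.

```python
from typing import Callable, Dict

# Hypothetical registries; the transforms are stand-in string functions,
# not real stylesheets.
to_xhtml: Dict[str, Callable[[str], str]] = {}
from_xhtml: Dict[str, Callable[[str], str]] = {}

def convert(source_type: str, output_type: str, text: str) -> str:
    """Go source -> xhtml -> output; the output step never sees source_type."""
    xhtml = to_xhtml[source_type](text)
    return from_xhtml[output_type](xhtml)

# The input author registers a single transform for the new doctype.
to_xhtml["diary"] = lambda text: f"<p>{text}</p>"

# An output author registers a transform without knowing about "diary".
from_xhtml["html"] = lambda xhtml: f"<html><body>{xhtml}</body></html>"

# Later, somebody adds a voice output format ...
from_xhtml["voice"] = lambda xhtml: f"<voice>{xhtml}</voice>"

# ... and it already works for diary, with no change on the input side.
print(convert("diary", "voice", "Dear diary"))
```

Adding the "voice" entry touched only from_xhtml, so every doctype that
already has a to_xhtml entry picks it up for free.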

If I've laid out the case for having an intermediate format correctly,
then there are a few consequences of this.

1) There must only be 1 intermediate format.  More formats means
proportionally more output transforms need to be written.
2) The intermediate file format must be able to represent any semantic
information (that could affect its output*) present in the source
document (source->intermediate shouldn't be a 'lossy' transform)
3) The intermediate file format must have no knowledge of what source
document was used - the type of source document is used for the
source->intermediate transform, but once in intermediate form, the
output transforms must have a common base to work from.

* By this I mean that you might have an attribute for version number in
your source document.  This does not need to be passed through to the
intermediate document, as it should be handled in the
source->intermediate transform

So, again, if I've got this all right, we need DTDs for all "Supported
by Forrest" source document types, and I sincerely hope that the
document-v?? format remains supported by Forrest.  You only need a DTD
for the intermediate format to sanity-check your source->intermediate

In this respect, I think XHTML is a perfect candidate for intermediate
file format.

PS I apologise if everyone thought this thread had been closed :)
PPS I also apologise for this being quite a long post, I needed to put
down everything I thought about onto paper so I can get it validated by
you guys :)
PPPS I'm using Forrest to revive some old tutorials to do with game
programming that I wrote.  The simplicity of writing in the document.dtd
format and just hitting "forrest site" is fantastic - keep up the good work!

Paul Smith
Postgraduate Student
Department of Mathematics
School of Engineering, Computer Science,
                            and Mathematics
University of Exeter
