forrest-dev mailing list archives

From Jim Dixon
Subject Re: xinclude
Date Mon, 11 Sep 2006 07:31:01 GMT
On Fri, 8 Sep 2006, Ross Gardler wrote:

> > I've thought about this a bit more.  One of the problems here is that
> > adding xi:include elements has unexpected results.
> >
> > If the DTD is extended as above, then the validator will, I think, not
> > check beyond the xi:include element, and so a document may validate
> > even though what is being XIncluded is nonsense.  I can write
> >   <p><xi:include href="rubbish.xml"/></p>
> > and validation will succeed, because the xi:include element has the
> > pattern required by the DTD even though rubbish.xml isn't XML at all.
> Good point.
> > The expected behavior is that the validator recognizes that what is being
> > XIncluded is XML (as it is by default) and goes through to validate that
> > as well, silently replacing the xi:include element with whatever is
> > XIncluded.  I think that some parsers do this - perhaps only if an
> > option is set - but most don't.
> Does Xalan do it? This is the default parser for Forrest. A healthy

Uhm, do you mean Xerces?  From what I can see Xalan is unaware of

> warning in the docs and output of the validate task may be sufficient
> for those using a different parser.

I have only skimmed the Forrest build files but Xerces must be handling
XInclusion, because after all Forrest works.  If I XInclude foo.txt in
a document, then its contents appear on the page fetched by my browser.

> > A better approach would be to process the XIncludes before validation,
> > stripping off the xmlns:xi attribute from the document element and
> > replacing xi:includes with whatever they resolve to.  This should be
> > cheaper than it might seem: unless the xmlns:xi is present, the
> > document is simply handed on to the validator untouched.
> I can't see an easy way of doing this as, in many cases, the included
> content is generated by Forrest. In fact, this would be a problem if the
> parser were doing the includes.

I am baffled.  How would it be a problem if the parser was doing the
includes?

Some people build XML documents by writing chapters or sections separately
and then XIncluding them into one master document.  That is, the top-level
document consists of a preamble followed by a series of xi:includes.
This is quite a sensible approach in many circumstances.  But if you do
this in Forrest using a DTD modified along the lines that I have taken, it
means that you can't actually validate the document, because the
validating parser will just check that the xi:includes in the top-level
document are permitted, find that they are, and then go on, ignoring the
contents of the chapters/sections.
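
To make the failure concrete, here is a toy sketch in Python.  It is not
a real DTD validator: ALLOWED_IN_BODY and top_level_ok are made-up names
standing in for a content model that permits xi:include, and the
document/body element names are only for illustration.  The point is that
the check never opens either href, so rubbish.xml passes as easily as a
real chapter.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# A master document that is just a preamble plus a series of xi:includes.
MASTER = """<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <header><title>Manual</title></header>
  <body>
    <xi:include href="chapter1.xml"/>
    <xi:include href="rubbish.xml"/>
  </body>
</document>"""

# Hypothetical stand-in for a DTD content model extended to allow xi:include.
ALLOWED_IN_BODY = {"section", XI_INCLUDE}


def top_level_ok(xml_text):
    """Check only the children of <body>, as a non-XInclude-aware validator would."""
    body = ET.fromstring(xml_text).find("body")
    # Neither href is ever dereferenced here; only the element names are checked.
    return all(child.tag in ALLOWED_IN_BODY for child in body)


print(top_level_ok(MASTER))
```

The check succeeds whatever chapter1.xml and rubbish.xml contain, which
is the sense in which the master document has not really been validated.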

To make this particular example (XIncluded sections) work, you would have
added xi:include to local.sections in the DTD.  However, anyone new to
Forrest but familiar with XInclude will expect to be able to use
xi:include in many places.

You can handle this with complex DTDs or by writing XSLT scripts to
replace the xi:includes with what they represent.  But this is perverse.
Think C: you don't change the grammar of C to explicitly recognize
#includes; you have a preprocessor that handles the inclusion and then
you parse what comes out of the preprocessor.

This is exactly how XIncludes should be handled: you make a pass that
dereferences the xi:includes, then you validate the output XML against
the DTD (one with no xi:includes in it).
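
A minimal sketch of such a pass, in Python, using the standard library's
xml.etree.ElementInclude.  This is an illustration, not what Forrest
does: SOURCES, loader and expand_xincludes are hypothetical names, the
"files" live in a dictionary so the example is self-contained, and the
cheap test looks for xi:include elements rather than for the xmlns:xi
declaration, because ElementTree does not expose namespace declarations
after parsing.  The DTD validation step itself is left out; the expanded
tree would be handed to whatever validating parser is in use.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical in-memory stand-ins for the files being XIncluded.
SOURCES = {
    "chapter1.xml": "<section><title>One</title><p>First chapter.</p></section>",
    "notes.txt": "plain text note",
    "rubbish.xml": "this is not XML at all",
}


def loader(href, parse, encoding=None):
    """Resolve an href; parse is "xml" or "text", as ElementInclude passes it."""
    text = SOURCES[href]
    if parse == "xml":
        return ET.fromstring(text)  # raises ET.ParseError if the target is not XML
    return text


def expand_xincludes(xml_text):
    """Return the document tree with every xi:include replaced by its target."""
    root = ET.fromstring(xml_text)
    # Cheap path: no xi:include elements, so the tree is handed on untouched.
    if root.find(f".//{XI_INCLUDE}") is None:
        return root
    ElementInclude.include(root, loader=loader)
    return root


DOC = """<document xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="chapter1.xml"/>
    <p><xi:include href="notes.txt" parse="text"/></p>
  </body>
</document>"""

expanded = expand_xincludes(DOC)
print(ET.tostring(expanded, encoding="unicode"))
```

Because no element in the expanded tree uses the XInclude namespace any
longer, the serialized output carries no xmlns:xi declaration, so it can
be checked against a DTD that knows nothing about xi:include.  Pointing
an include at rubbish.xml now fails in the preprocessing pass with a
parse error instead of slipping past the validator.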

Jim Dixon   tel +44 117 982 0786  mobile +44 797 373 7881
