Jim Dixon wrote:
> On Fri, 8 Sep 2006, Ross Gardler wrote:
>
>
>>>I've thought about this a bit more. One of the problems here is that
>>>adding xi:include elements has unexpected results.
>>>
>>>If the DTD is extended as above, then the validator will, I think, not
>>>check beyond the xi:include element, and so a document may validate
>>>even though what is being XIncluded is nonsense. I can write
>>>
>>>and validation will succeed, because the xi:include element has the
>>>pattern required by the DTD even though rubbish.xml isn't XML at all
>>
>>Good point.
>>
>>
>>>The expected behavior is that the validator recognizes that what is being
>>>XIncluded is XML (as it is by default) and goes through to validate that
>>>as well, silently replacing the xi:include element with whatever is
>>>XIncluded. I think that some parsers do this - perhaps only if an
>>>option is set - but most don't.
>>
>>Does Xalan do it? This is the default parser for Forrest. A healthy
>
>
> Uhm, do you mean Xerces? From what I can see Xalan is unaware of
> XIncludes.

Yes, sorry, I often mix up the names Xerces and Xalan.

>>>A better approach would be to process the XIncludes before validation,
>>>stripping off the xmlns:xi attribute from the document element and
>>>replacing xi:includes with whatever they resolve to. This should be
>>>cheaper than it might seem: unless the xmlns:xi is present, the
>>>document is simply handed on to the validator untouched.
>>
>>I can't see an easy way of doing this as, in many cases, the included
>>content is generated by Forrest. In fact, this would be a problem if the
>>parser were doing the includes.
>
>
> I am baffled. How would it be a problem if the parser was doing the
> XIncludes?

David points out in another message that validate-xdocs runs before
Forrest does any transformations on content; it only validates the
*source* documents.

This means that if a source document XIncludes another source document
that is available statically on disk or over the network, as in your
use case, then the above will work OK.
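
To make the static case concrete, here is a minimal sketch of "resolve
the XIncludes first, then hand the result to the validator". It is not
Xerces or Forrest code: it uses Python's standard xml.etree.ElementInclude
as a stand-in, and the names resolve_static_includes, load_from_disk and
demo are made up for illustration. It also assumes the included file sits
next to the source document.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementInclude, ElementTree

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def resolve_static_includes(path):
    """Parse the document at `path` and expand its xi:include elements in place."""
    path = Path(path)
    root = ElementTree.parse(path).getroot()

    def load_from_disk(href, parse, encoding=None):
        # Assumption: only static files relative to the source document.
        target = path.parent / href
        if parse == "xml":
            # Raises ParseError if the target is not well-formed XML.
            return ElementTree.parse(target).getroot()
        return target.read_text(encoding=encoding or "utf-8")

    ElementInclude.include(root, loader=load_from_disk)
    return root


def demo(included_text):
    """Write a two-file example to a temp dir and return the expanded root."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "part.xml").write_text(included_text, encoding="utf-8")
        Path(tmp, "main.xml").write_text(
            '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
            '<xi:include href="part.xml"/>'
            "</document>",
            encoding="utf-8",
        )
        return resolve_static_includes(Path(tmp, "main.xml"))
```

With a well-formed part.xml the returned tree no longer contains any
xi:include element, so a validator would see the included content
itself. If part.xml is not XML at all (the rubbish.xml case), the
expansion step fails with a parse error instead of letting the document
validate.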

However, if a source document includes content that is dynamically
generated (for example, pulled from a database, an RSS feed or a Jira
instance), then we would have to fire up Forrest to generate those
sources. If we validate source documents before we fire up Forrest, we
end up in a catch-22.

One solution would be to start a running instance of Forrest (aka
"forrest run") and have Xerces validate the XIncludes by retrieving
them from that instance. But this really is clumsy and, I would guess,
non-trivial.
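
Roughly, the shape of that idea is sketched below. Again this is Python
standard library code standing in for Xerces, not anything Forrest
provides. make_loader, expand, DynamicSourceUnavailable and the
fetch_dynamic callback are all hypothetical names; fetch_dynamic
represents "ask a running Forrest instance for this href".

```python
from pathlib import Path
from xml.etree import ElementInclude, ElementTree


class DynamicSourceUnavailable(Exception):
    """Raised when an include needs Forrest but no running instance was given."""


def make_loader(source_dir, fetch_dynamic=None):
    """Build an XInclude loader: static files from disk, the rest via Forrest.

    `fetch_dynamic` is a hypothetical callable that takes an href and
    returns the XML text a running Forrest instance would generate for it.
    """
    source_dir = Path(source_dir)

    def loader(href, parse, encoding=None):
        target = source_dir / href
        if target.is_file():
            # Static case: the included source already exists on disk.
            text = target.read_text(encoding=encoding or "utf-8")
        elif fetch_dynamic is not None:
            # Dynamic case: only works while Forrest is running.
            text = fetch_dynamic(href)
        else:
            # The catch-22: validation runs before Forrest has started.
            raise DynamicSourceUnavailable(href)
        return ElementTree.fromstring(text) if parse == "xml" else text

    return loader


def expand(xml_text, loader):
    """Parse `xml_text` and expand its xi:include elements with `loader`."""
    root = ElementTree.fromstring(xml_text)
    ElementInclude.include(root, loader=loader)
    return root
```

In a real setup fetch_dynamic might be an HTTP GET (for example with
urllib.request.urlopen) against wherever "forrest run" is listening.
Without it, any include that is not a file on disk fails, which is
exactly the situation validate-xdocs is in today.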

My point is that any solution created to better support the first use
case (including static content) must also work in the second use case
(including dynamic content).

Ross