Jim Dixon wrote:
> On Fri, 8 Sep 2006, Ross Gardler wrote:
>
>>>I've thought about this a bit more. One of the problems here is that
>>>adding xi:include elements has unexpected results.
>>>
>>>If the DTD is extended as above, then the validator will, I think, not
>>>check beyond the xi:include element, and so a document may validate
>>>even though what is being XIncluded is nonsense. I can write
>>>

>>>and validation will succeed, because the xi:include element has the
>>>pattern required by the DTD even though rubbish.xml isn't XML at all
>>
>>Good point.
>>
>>
>>>The expected behavior is that the validator recognizes that what is being
>>>XIncluded is XML (as it is by default) and goes through to validate that
>>>as well, silently replacing the xi:include element with whatever is
>>>XIncluded. I think that some parsers do this - perhaps only if an
>>>option is set - but most don't.
>>
>>Does Xalan do it? This is the default parser for Forrest. A healthy
>
>
> Uhm, do you mean Xerces? From what I can see Xalan is unaware of
> XIncludes.

Yes, I often get the Xerces and Xalan names mixed up, sorry.

>>>A better approach would be to process the XIncludes before validation,
>>>stripping off the xmlns:xi attribute from the document element and
>>>replacing xi:includes with whatever they resolve to. This should be
>>>cheaper than it might seem: unless the xmlns:xi is present, the
>>>document is simply handed on to the validator untouched.
>>
>>I can't see an easy way of doing this as, in many cases, the included
>>content is generated by Forrest. In fact, this would be a problem if the
>>parser were doing the includes.
>
>
> I am baffled. How would it be a problem if the parser was doing the
> XIncludes?

David points out in another message that validate-xdocs is done prior
to Forrest doing any transformations on content; it only validates the
*source* documents.

This means that if a source document XIncludes another source document
that is available statically on disk/network, as in your use case, then
the above will work OK.

However, if a source document includes source content that is
dynamically generated, for example pulled from a database/RSS feed/Jira
instance etc., then we would have to fire up Forrest to generate these
sources. If we are validating source documents before we fire up Forrest
we end up in a catch-22.
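
To make the "process the XIncludes before validation" idea concrete, here
is a minimal sketch. It is not Forrest code (Forrest is Java); it is Python
using the standard library's xml.etree.ElementInclude, purely as an
illustration. SOURCES, loader and resolve_xincludes are names invented for
the example, and the in-memory dictionary stands in for whatever would
really supply the included content (files on disk, or a running Forrest).

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"

# Hypothetical stand-in for documents on disk or generated by Forrest.
SOURCES = {
    "good.xml": "<section><title>Included</title></section>",
    "rubbish.xml": "this is not XML at all",
}


def loader(href, parse, encoding=None):
    """Resolve one xi:include target (assumed lookup, replace as needed)."""
    text = SOURCES[href]
    if parse == "xml":
        # Non-XML content fails here, before any DTD validation happens.
        return ET.fromstring(text)
    return text


def resolve_xincludes(xml_text):
    """Return the root element with xi:include elements replaced."""
    root = ET.fromstring(xml_text)
    # Cheap path: this sketch checks for xi:include elements rather than
    # for the xmlns:xi declaration itself.
    if root.find(".//{%s}include" % XI_NS) is None:
        return root
    ElementInclude.include(root, loader=loader)
    return root
```

With this, a document that XIncludes good.xml comes back with the included
section in place and can be handed to the validator, while one that
XIncludes rubbish.xml raises xml.etree.ElementTree.ParseError instead of
validating silently. The loader is the only part that cares whether the
content is static or dynamic, which is exactly where the problem described
above shows up.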

One solution would be to fire up a running instance of Forrest (aka
forrest run) and have Xerces validate the XIncludes by retrieving them
from the running instance of Forrest. But this really is clumsy and, I
would guess, non-trivial.

My point is, any solution that is created to better support the first
use case (including static content) must also work in the second use
case (including dynamic content).

Ross