Message-ID: <45052AC1.6020906@apache.org>
Date: Mon, 11 Sep 2006 10:22:09 +0100
From: Ross Gardler
To: dev@forrest.apache.org
Subject: Re: xinclude
References: <6212b1d00606290824p5557e258q7d878c14d62cd89f@mail.gmail.com>
 <1151595390.8177.39.camel@localhost.localdomain>
 <44FB3B5C.80808@apache.org> <44FBE573.2010609@apache.org>
 <1157552352.8095.12.camel@localhost.localdomain>
 <44FEEBDA.8030305@apache.org> <45013DD7.4070909@apache.org>
 <45015229.9070400@apache.org> <450525A6.3080304@apache.org>
In-Reply-To: <450525A6.3080304@apache.org>

Ross Gardler wrote:
> Jim Dixon wrote:
>
>> On Fri, 8 Sep 2006, Ross Gardler wrote:
>>
>>>> I've thought about this a bit more. One of the problems here is that
>>>> adding xi:include elements has unexpected results.
>>>>
>>>> If the DTD is extended as above, then the validator will, I think, not
>>>> check beyond the xi:include element, and so a document may validate
>>>> even though what is being XIncluded is nonsense. I can write
>>>>

>>>> and validation will succeed, because the xi:include element has the
>>>> pattern required by the DTD even though rubbish.xml isn't XML at all
>>>
>>> Good point.
>>>
>>>> The expected behavior is that the validator recognizes that what is
>>>> being XIncluded is XML (as it is by default) and goes through to
>>>> validate that as well, silently replacing the xi:include element with
>>>> whatever is XIncluded. I think that some parsers do this - perhaps
>>>> only if an option is set - but most don't.
>>>
>>> Does Xalan do it? This is the default parser for Forrest. A healthy
>>
>> Uhm, do you mean Xerces? From what I can see Xalan is unaware of
>> XIncludes.
>
> Yes, I often get Xerces and Xalan names mixed up, sorry.
>
>>>> A better approach would be to process the XIncludes before validation,
>>>> stripping off the xmlns:xi attribute from the document element and
>>>> replacing xi:includes with whatever they resolve to. This should be
>>>> cheaper than it might seem: unless the xmlns:xi is present, the
>>>> document is simply handed on to the validator untouched.
>>>
>>> I can't see an easy way of doing this as, in many cases, the included
>>> content is generated by Forrest. In fact, this would be a problem if the
>>> parser were doing the includes.
>>
>> I am baffled. How would it be a problem if the parser was doing the
>> XIncludes?
>
> David points out in another message that the validate-xdocs is done
> prior to Forrest doing any transformations on content, it only validates
> the *source* documents.
>
> This means that if a source document XIncludes another source document
> that is available statically on disk/network, as in your use case, then
> the above will work OK.
>
> However, if a source document includes source content that is
> dynamically generated, for example, pulled from a database/RSS Feed/Jira
> instance etc. then we would have to fire up Forrest to generate these
> sources. If we are validating source documents before we fire up Forrest
> we end up in a catch-22.
>
> One solution would be to fire up a running instance of Forrest (aka
> forrest run) and have Xerces validate the XIncludes by retrieving them
> from the running instance of Forrest. But this really is clumsy and I
> would guess non-trivial.
>
> My point is, any solution that is created to better support the first
> use case (including static content) must also work in the second use
> case (including dynamic content).

Let me clarify so as not to discourage your hunt for a solution...

Any solution would need to work alongside a solution that works for the
dynamically generated content stuff.

Personally, I think being able to turn off validation on certain pages,
as is currently the case, is just fine.

Ross
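
A minimal sketch of the "process the XIncludes before validation" idea
quoted above, written in Python with the standard library rather than
Forrest's own Java/Xerces stack. The helper names (uses_xinclude,
make_memory_loader, resolve_before_validation) and the validate callback
are hypothetical, and the in-memory loader merely stands in for wherever
the included sources really live. ElementTree does not expose namespace
declarations, so the sketch looks for xi:include elements instead of the
xmlns:xi attribute itself. It covers only statically available content
and does not address the dynamically generated case.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"
XI_INCLUDE = "{%s}include" % XI_NS


def uses_xinclude(root):
    """Return True if any descendant of root is an xi:include element."""
    return root.find(".//" + XI_INCLUDE) is not None


def make_memory_loader(sources):
    """Build an ElementInclude loader backed by a dict of href -> text.

    Hypothetical stand-in for reading included files from disk or network.
    """
    def loader(href, parse, encoding=None):
        if href not in sources:
            return None  # ElementInclude turns this into FatalIncludeError
        if parse == "xml":
            # Raises ET.ParseError if the included source is not XML.
            return ET.fromstring(sources[href])
        return sources[href]
    return loader


def resolve_before_validation(xml_text, loader, validate):
    """Expand xi:include elements, then hand the tree to validate().

    Documents without xi:include elements go to the validator untouched.
    validate is a caller-supplied callable (an assumption of this sketch).
    """
    root = ET.fromstring(xml_text)
    if uses_xinclude(root):
        ElementInclude.include(root, loader=loader)
    validate(root)
    return root
```

With this arrangement, an include that points at something that is not
XML fails while it is being loaded, instead of passing validation
unnoticed.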