cocoon-dev mailing list archives

From Jeff Turner <je...@apache.org>
Subject Re: XML validation during Cocoon build
Date Mon, 21 Oct 2002 13:10:29 GMT
On Mon, Oct 21, 2002 at 08:31:15PM +1000, David Crossley wrote:
> Colin Paul Adams wrote:
> > >>>>> "David" == David Crossley <crossley@indexgeo.com.au> writes:
> > 
> > David> We could. However, proper support for Entity Catalogs is
> > David> not yet in Ant. So we need to use a rudimentary catalog
> > David> facility which automatically builds an internal
> > David> catalog. This works, but is cumbersome. 
> > 
> > David> I still think that the Anteater discussion links that i
> > David> provided earlier in this thread is the most promising
> > David> option. This was building validation facilities for
> > David> Anteater which could also be used in Cocoon.
> > 
> > OK. Has this progressed at all?
> 
> I do not know, i just signed on to the aft-devel list
> to help out. Perhaps the others can say ... Ivelin, Jeff ...

No progress on the specific topic of that email.  I think JARV [1] might be
a better set of interfaces to standardise on.

In the context of this discussion, Anteater is probably the wrong tool.
I think the best solution would be to add a DOCTYPE declaration to the
sitemap and let the parser validate.  This has the added benefit that
users with catalog-aware editors [2] can validate as they edit.
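
As a rough sketch of what "let the parser validate" amounts to with plain
JAXP: turn on validation and collect whatever the parser reports.  The class
name and the toy two-element DTD used in the usage note are made up for
illustration; this is not the real sitemap DTD, and not how Cocoon
currently loads sitemaps.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, for illustration only.
public class DoctypeValidationSketch {

    /** Parses the document with DTD validation on; returns the validation errors. */
    public static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD named in the DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<String>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity violations arrive here as recoverable errors.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness problems are fatal; let them propagate.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

Feeding it a document whose DOCTYPE carries an internal subset such as
`<!DOCTYPE sitemap [<!ELEMENT sitemap (pipelines)><!ELEMENT pipelines EMPTY>]>`
should give an empty list for `<sitemap><pipelines/></sitemap>` and a
non-empty one if an undeclared element appears.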

> > I don't necessarily suggest integrating it into CVS, as it will
> > involve adding DOCTYPEs to all the sitemap.xmap files, and this might
> > add extra overhead during parsing.
> 
> I wondered about that too. How often does a sitemap get parsed? Perhaps
> the overhead is immaterial.

If performance becomes a problem, we can add a switch to cocoon.xconf
which turns off sitemap validation.  I wrote a tool,
http://doctypechanger.sf.net, for programmatically stripping off a DOCTYPE
declaration, which is the only way to prevent DTD parsing.  If we want to
go this route, I can suggest integrating this 'switch-off-DTDs' flag into
the o.a.e.xml.Parser implementation in Excalibur.  It could then be
exposed in cocoon.xconf as something like:

<xml-parser ...
  <parameter name="validate" value="false"/>
  <strip-doctypes>

    <!-- Don't validate sitemaps -->
    <publicId>-//APACHE//DTD Cocoon Sitemap V1.0//EN</publicId>

    <!-- Don't validate treeprocessor-builtins.xml -->
    <rootElement>tree-processor</rootElement>

    <!-- DO validate Forrest docs.. 
    <publicId>-//APACHE//DTD XML Documentation V1.1//EN</publicId>
    -->
    ....
  </strip-doctypes>
</xml-parser>
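
The matching rule behind such a <strip-doctypes> block could be as simple as
the sketch below: strip when either the public ID or the root element is
listed.  The class and method names are invented for illustration; nothing
like this exists in Excalibur today, and the actual stripping of the
declaration is left out.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical rule holder, for illustration only.
public class StripDoctypeRules {
    private final Set<String> publicIds;
    private final Set<String> rootElements;

    public StripDoctypeRules(Set<String> publicIds, Set<String> rootElements) {
        // Defensive copies so later changes to the caller's sets have no effect.
        this.publicIds = new HashSet<String>(publicIds);
        this.rootElements = new HashSet<String>(rootElements);
    }

    /** True if the DOCTYPE should be stripped, i.e. the document is not validated. */
    public boolean shouldStrip(String publicId, String rootElement) {
        return (publicId != null && publicIds.contains(publicId))
            || (rootElement != null && rootElements.contains(rootElement));
    }
}
```

With the configuration above, a sitemap (matched by public ID) and
treeprocessor-builtins.xml (matched by root element) would be stripped, while
a Forrest document would still be validated because its public ID is not
listed.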


In the long term it might be better to abandon DTDs and this silly idea
of parse-time validation altogether.  But since Colin went to all the
trouble of writing a DTD, it would be good to use it :)


--Jeff

[1] http://iso-relax.sourceforge.net/JARV/ 
[2] http://xml.apache.org/forrest/your-project.html#N102AD


