xml-commons-dev mailing list archives

From Michael Glavassevich <mrgla...@ca.ibm.com>
Subject Re: resolver should be able to parse catalog files without needing to resolve external entities?
Date Sat, 24 Oct 2009 19:10:58 GMT

The OASIS catalog DTD is included in resolver.jar, and a BootstrapResolver
[1] that can return this DTD gets installed on the parser that reads the
catalog. I'm sure the reason that isn't happening here is that the public
and system IDs in that catalog differ from the ones the resolver knows
about. If you need support for more than the well-known public IDs and URIs
for the catalog DTDs / schemas, you're supposed to extend BootstrapResolver
(in your own application) and set an instance of this extension on the
CatalogManager [2].
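
To illustrate the mechanism, here is a minimal sketch that uses only JDK
classes, so it is not the resolver's own code. It shows an EntityResolver
that recognizes the extra public ID from the catalog quoted below and hands
the parser a local DTD instead of letting it go to the network. The class
name ExtraCatalogDtdResolver, the parseCatalog helper, the hits counter and
the one-line placeholder DTD are all made up for the example. In a real
application the same check would presumably live in a subclass of
BootstrapResolver, and the DTD would come from a local file or classpath
resource.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical resolver that knows one extra, non-standard catalog DTD identifier. */
class ExtraCatalogDtdResolver implements EntityResolver {
    // Public ID used by the Debian/Ubuntu catalog.xml quoted in this thread.
    static final String EXTRA_PUBLIC_ID =
        "-//GlobalTransCorp//DTD XML Catalogs V1.0-Based Extension V1.0//EN";

    // Placeholder DTD text (assumption); substitute a local copy of the real DTD.
    static final String PLACEHOLDER_DTD = "<!ELEMENT catalog ANY>";

    // Counts how often the extra identifier was resolved locally.
    int hits = 0;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (EXTRA_PUBLIC_ID.equals(publicId)) {
            hits++;
            // Supplying a character stream means the parser never opens systemId.
            InputSource source = new InputSource(new StringReader(PLACEHOLDER_DTD));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        }
        // Unknown identifier: returning null leaves resolution to the parser.
        return null;
    }

    /** Parses catalog text with the given resolver installed on the parser. */
    static Document parseCatalog(String xml, EntityResolver resolver) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(resolver);
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse catalog", e);
        }
    }
}
```

With the resolver library itself, an instance of your BootstrapResolver
extension is what you would pass to CatalogManager.setBootstrapResolver [2];
the sketch above only demonstrates the lookup-by-public-ID idea on a plain
JAXP parser.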

Thanks.

[1]
http://xml.apache.org/commons/components/apidocs/resolver/org/apache/xml/resolver/helpers/BootstrapResolver.html
[2]
http://xml.apache.org/commons/components/apidocs/resolver/org/apache/xml/resolver/CatalogManager.html#setBootstrapResolver(org.apache.xml.resolver.helpers.BootstrapResolver)

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

Earl Hood <earl@earlhood.com> wrote on 10/24/2009 01:22:50 PM:

> On October 23, 2009 at 17:19, someone wrote:
>
> > Here's an example of a catalog.xml file distributed in the Debian and
> > Ubuntu w3c-dtd-xhtml package,
> > http://www.sfu.ca/~jdbates/tmp/debian/200910230/catalog.xml
> >
> > It starts with,
> >
> > <?xml version='1.0'?>
> > <!DOCTYPE catalog PUBLIC
> >     "-//GlobalTransCorp//DTD XML Catalogs V1.0-Based Extension V1.0//EN"
> >     "http://globaltranscorp.org/oasis/catalog/xml/tr9401.dtd">
> > [...]
>
> I think they should fix it so the system identifier is set
> to a pathname on the local file system.
>
> Also, the public identifier used is not the standard public
> identifier, "-//OASIS//DTD XML Catalogs V1.1//EN".  So even
> if the resolver provided intrinsic recognition of
> the "-//OASIS//DTD XML Catalogs V1.1//EN" identifier, it
> would still be of no use in this case.
>
> One can argue that the w3c-dtd-xhtml package has a bug in
> its distribution since it provides no facility to resolve
> the DTD to the local file system.  The system identifier
> should be set to the pathname where the catalog DTD is placed
> by the w3c-dtd-xhtml installer.
>
> > I understand comment #4,
> > https://bugs.launchpad.net/ubuntu/+source/w3c-dtd-xhtml/+bug/400259/comments/4
> >
> > - to be suggesting that org.apache.xml.resolver is not following the
> > encouragement of,
> > http://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html#s.bootstrap
> >
> > "Implementations are encouraged to provide some sort of bootstrapping
> > functionality to resolve external identifiers and URIs that the
> > implementation needs to load catalog entry files.
>
> It is not a requirement:
>
>   Conformant processors are not required to be able to perform
>   resolution of those identifiers through the XML Catalog.
>
> The word "should" is used in other text instead of "must".  Also,
> the following is stated:
>
>   Users can avoid any problems that might arise by limiting the
>   external identifiers and URIs used to those that do not need
>   resolution. Note that this only applies to external identifiers and
>   URIs that must be resolved in order to load the catalog entry file.
>
> > - and to be suggesting that not following this encouragement is a bug
> >
> > Is maybe my understanding wrong - or either of these suggestions wrong?
>
> The recommendations of the OASIS document are beneficial, but
> they are only recommendations, not requirements.  So the "bug"
> reports are really enhancement requests.
>
> IMO, the work-around for the problem is easy, and is directly
> suggested by the OASIS document: Use system identifiers that
> are resolvable without the need of a catalog.
>
> I think the underlying technical problem of why the resolver library
> does not provide intrinsic resolution of the catalog DTD is that
> the library does not know where the DTD may be installed for any
> system that uses the resolver.  Since other software systems include
> the resolver in their distribution, the DTD itself may not even
> be available.
>
> A possible method of always knowing how to find the catalog DTD is
> for the resolver to include the DTD in the resolver.jar file itself.
> The resolver could register a custom (internal) resolver to the XML
> parser when reading catalog files so any references to the DTD can
> be resolved via a classpath resource lookup.  IMO, I'm not sure it
> is worth the effort to do this when simple work-arounds exist for
> the problem.
>
> I'm sure patches are welcome if anyone wants to implement this
> functionality.
>
> --ewh