xml-commons-dev mailing list archives

From Earl Hood <e...@earlhood.com>
Subject Re: resolver should be able to parse catalog files without needing to resolve external entities?
Date Fri, 23 Oct 2009 21:47:00 GMT
On October 23, 2009 at 09:52, Jack Bates wrote:

> I'm getting the following exception using the XML catalog resolver with
> FOP,

>         file:/usr/share/xml/xhtml/schema/dtd/catalog.xml
> resolvePublic(-//W3C//DTD XHTML 1.0 Transitional//EN,null)
> Switching to delegated catalog(s):
>         file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
> Parse catalog: file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
> Loading catalog: file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
> Default BASE: file:/usr/share/xml/xhtml/schema/dtd/1.0/catalog.xml
> 22-Oct-2009 5:04:13 PM org.apache.fop.cli.Main startFOP
> SEVERE: Exception
> javax.xml.transform.TransformerException: java.net.UnknownHostException: 
> globaltranscorp.org

Is globaltranscorp.org resolvable on the system where you are running
FOP?  The UnknownHostException indicates that the JVM's underlying
network library cannot resolve the hostname.

As for the "resolve external entities" problem, it appears to be a
chicken-and-egg problem.  The resolver depends on the underlying
XML parser to parse the XML catalog file, but at that point the
parser's default entity resolution is in effect, since the resolver
is still bootstrapping itself.

If your catalog file contains a DOCTYPE declaration with a public and
system identifier, then the XML parser will try to resolve it, and if
the system identifier listed is not accessible, you will get an error.

All of this is a function of the XML parser itself and NOT the
resolver library.

In practice, I normally do not specify a doctype declaration for
catalog files to avoid the unnecessary overhead of parsing a DTD.
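
To illustrate the difference, here is a minimal sketch using the JDK's
built-in JAXP parser rather than the resolver library itself.  It records
which system identifiers the parser asks for while reading a catalog with
and without a doctype declaration.  The class name, the catalog entry, and
the http://example.invalid/catalog.dtd URL are placeholders of mine, not
anything from your setup, and I am assuming the parser's default settings,
under which the external DTD is loaded even without validation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class DoctypeLookupDemo {
    // Hypothetical catalog content; the uri value is only an illustration.
    static final String CATALOG_BODY =
        "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
        + "<public publicId=\"-//W3C//DTD XHTML 1.0 Transitional//EN\""
        + " uri=\"xhtml1-transitional.dtd\"/>"
        + "</catalog>";

    // Same catalog, preceded by a doctype with a placeholder system identifier.
    static final String WITH_DOCTYPE =
        "<!DOCTYPE catalog SYSTEM \"http://example.invalid/catalog.dtd\">"
        + CATALOG_BODY;

    // Parses xml and returns every system identifier the parser tried to resolve.
    static List<String> requestedSystemIds(String xml) throws Exception {
        List<String> requested = new ArrayList<>();
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            // Hand back an empty DTD so the demo never touches the network.
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("with DOCTYPE:    " + requestedSystemIds(WITH_DOCTYPE));
        System.out.println("without DOCTYPE: " + requestedSystemIds(CATALOG_BODY));
    }
}
```

Without the recording entity resolver, the lookup for the first document
would go to the network, which is the situation the catalog parse is in
while the resolver is still starting up.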

If you absolutely need to have DTD validation of your catalog files,
make sure the system identifier is resolvable, preferably to a
location on the local file system, both for better performance and
to avoid depending on a remote system.
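
As a rough sketch of the local-file approach, again with the JDK's JAXP
parser and not the resolver library: the code below writes a DTD to a
temporary file, points the catalog's doctype at that file's URI, and
validates.  The DTD here is a tiny stand-in I made up for the example, not
the real OASIS catalog DTD, and the class and method names are likewise
hypothetical.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class LocalCatalogDtdDemo {
    // Stand-in DTD for illustration only; NOT the real OASIS catalog DTD.
    static final String STAND_IN_DTD =
        "<!ELEMENT catalog (public)*>\n"
        + "<!ATTLIST catalog xmlns CDATA #FIXED"
        + " \"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
        + "<!ELEMENT public EMPTY>\n"
        + "<!ATTLIST public publicId CDATA #REQUIRED uri CDATA #REQUIRED>\n";

    // Validates catalogBody against a DTD stored on the local file system
    // and returns the validation messages (empty when the catalog is valid).
    static List<String> validateAgainstLocalDtd(String catalogBody) throws Exception {
        Path dtd = Files.createTempFile("catalog", ".dtd");
        try {
            Files.write(dtd, STAND_IN_DTD.getBytes(StandardCharsets.UTF_8));
            // The system identifier is a file: URI, so no remote host is involved.
            String xml = "<!DOCTYPE catalog SYSTEM \"" + dtd.toUri() + "\">\n"
                + catalogBody;

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();

            List<String> problems = new ArrayList<>();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { problems.add(e.getMessage()); }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return problems;
        } finally {
            Files.deleteIfExists(dtd);
        }
    }

    public static void main(String[] args) throws Exception {
        String body =
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<public publicId=\"-//W3C//DTD XHTML 1.0 Transitional//EN\""
            + " uri=\"xhtml1-transitional.dtd\"/>"
            + "</catalog>";
        System.out.println("validation problems: " + validateAgainstLocalDtd(body));
    }
}
```

In a real installation the DTD would sit at a fixed local path rather than
in a temporary file; the temporary file only keeps the example
self-contained.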

