commons-user mailing list archives

From Simon Kitching <skitch...@apache.org>
Subject Re: [Digester] Unable to get validation working
Date Tue, 26 Apr 2005 06:22:40 GMT
On Mon, 2005-04-25 at 22:13 -0700, Craig McClanahan wrote:
> > 
> > The last piece of the puzzle though, which took about 45 minutes of
> > banging my head on the desk to figure out, was the DOCTYPE definition in
> > the XML file I was trying to validate.  I had to use:
> > 
> > <!DOCTYPE myConfig PUBLIC "myConfig" "myConfig">
> > 
> > It seemed more logical to use SYSTEM, but it kept trying to pull the
> > file from various locations on the local file system (as the name
> > indicates).  Even when I tried to get the path to point at the actual
> > file in WEB-INF for instance though, it would never work for one reason
> > or another.  This, even though it doesn't seem right to me (and maybe
> > it's not!) *does* work.
> > 
> 
> PUBLIC is definitely what you want for this sort of thing.  There are
> also conventional formats for the public identifier and system
> identifier if this DTD was going to be a public thing ... but it
> doesn't matter for application private things.

Yep. All XML DOCTYPE declarations really should have a PUBLIC
identifier. Ones that have only a system identifier are extremely
"fragile" in that they expect the machine they are processed on to have
the dtd available in a specific place.

The whole purpose of the PUBLIC identifier is to be able to redirect
requests for the associated dtd to some arbitrary file, in exactly the
way you want to do via the "Digester.register(publicId, URL)" method.
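
As a rough illustration of that redirection (a sketch of the idea only,
not Digester's actual source), an EntityResolver can look the public ID
up in a table and hand back the registered location. The class name
PublicIdRegistry and the example path are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: maps PUBLIC identifiers to wherever the DTD really lives.
class PublicIdRegistry implements EntityResolver {
    private final Map<String, String> urlsByPublicId = new HashMap<>();

    // Remember the redirect target for a given public ID.
    public void register(String publicId, String dtdUrl) {
        urlsByPublicId.put(publicId, dtdUrl);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String url = (publicId == null) ? null : urlsByPublicId.get(publicId);
        if (url == null) {
            // Unknown public ID: returning null lets the parser fall back
            // to opening the system ID itself.
            return null;
        }
        // The parser will read the DTD from the registered location instead.
        return new InputSource(url);
    }
}
```

With the DOCTYPE quoted above, registering "myConfig" against the real
location of the DTD is what stops the parser from going looking for a
file literally called "myConfig".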

SYSTEM identifiers are not meant to be redirectable - they literally
reference a file location (a path or URL). This is why there is no
"Digester.register(systemId, URL)" method. If there is no public ID, or
the public ID is not "registered", then Digester just tries to use the
SYSTEM id as a literal reference to a file - which, in your case, it
couldn't find. Of course if you really are stuck with a bad xml file
that only has a SYSTEM id (this happens) you can write your own
EntityResolver rather than relying on Digester to do this work.
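
A minimal sketch of such a resolver, using only the standard SAX
interfaces. The class name SystemIdResolver and the idea of matching on
the tail of the system ID are my own assumptions for illustration; the
parser normally turns a relative system ID into an absolute URI before
calling the resolver, which is why an exact match on the original string
tends not to work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver for documents that only carry a SYSTEM identifier.
class SystemIdResolver implements EntityResolver {
    // Insertion-ordered so the first matching suffix wins predictably.
    private final Map<String, String> urlsBySuffix = new LinkedHashMap<>();

    // Map anything ending in systemIdSuffix to the given DTD location.
    public void map(String systemIdSuffix, String dtdUrl) {
        urlsBySuffix.put(systemIdSuffix, dtdUrl);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null) {
            return null;
        }
        for (Map.Entry<String, String> entry : urlsBySuffix.entrySet()) {
            // The system ID has usually been made absolute by the parser,
            // so compare only its tail.
            if (systemId.endsWith(entry.getKey())) {
                InputSource source = new InputSource(entry.getValue());
                source.setPublicId(publicId);
                return source;
            }
        }
        return null; // not ours: let the parser use its default behaviour
    }
}
```

An instance of this would be handed to the parser with the standard
XMLReader.setEntityResolver method.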

And as Craig says, the public ID really should be a much longer and much
more unique string than just "myConfig", so that software that deals
with many different xml document types can tell them apart. However if
this particular document is only expected to be fed into pieces of
software that deal with just this document type then you can get away
with this.

> 
> > Cool, one step closer to completion!
> > 

Good luck.

By the way, if you get confused you can always avoid Digester's
validation support and set things up the traditional way, i.e. you can
always do
  digester.setValidating(true);
  XMLReader reader = digester.getXMLReader();
then call the XMLReader methods directly to configure validation (in
particular, setting a custom EntityResolver via
reader.setEntityResolver). This won't bother the digester at all;
digester really doesn't care much about validation as that occurs within
the parser before events are passed to Digester.


One other thing you should watch out for when enabling validation: by
default, Digester will ignore errors reported by the parser (other than
logging them). You will need to register a custom error handler in order
to take appropriate action when an error occurs (which probably just
means throwing the parameter object as an exception):
   digester.setErrorHandler(myCustomErrorHandler);
or
   digester.getXMLReader().setErrorHandler(myCustomErrorHandler);
The only difference between these two is that the first allows Digester
to log a message before forwarding to your custom errorhandler.
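
For what it's worth, a handler that just throws the parameter object can
be as small as the sketch below. The class name ThrowingErrorHandler is
made up, and ignoring warnings is only one possible choice.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical handler: turns validation problems into exceptions so that
// parsing stops instead of carrying on with an invalid document.
class ThrowingErrorHandler implements ErrorHandler {
    @Override
    public void warning(SAXParseException e) {
        // Warnings are deliberately ignored in this sketch.
    }

    @Override
    public void error(SAXParseException e) throws SAXParseException {
        // Validity errors (e.g. the document does not match the DTD) land here.
        throw e;
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Well-formedness errors land here.
        throw e;
    }
}
```

The exception then propagates out of the parse call, where the
application can report the line and column it carries.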


Regards,

Simon



