commons-dev mailing list archives

From robert burrell donkin
Subject Re: [digester] can't resolve relative entities ?
Date Mon, 29 Mar 2004 20:09:07 GMT
On 29 Mar 2004, at 20:52, robert burrell donkin wrote:

> On 29 Mar 2004, at 18:52, Craig McClanahan wrote:
>> Paul Libbrecht wrote:
>>> I think Digester.parse( should do it for me, or?
>>> (this method does build an input-source with correct URL, btw)
>>> There's even, in the maven code, efforts towards making this an 
>>> absolute path.
>> In theory it should ... but if it doesn't, you can easily construct a 
>> URL for a file and use the technique I described.
>>> But the problem remains: if you look at the code of, 
>>> there's nothing that keeps the URL of the file! And the call to the 
>>> method configure() is without any parameter!
>> But that's a feature, not a bug :-).  No code in Digester is 
>> necessary, because it's all handled by the SAX parser underneath.
>>> I do think, contrary to what Robert claims, that XML-compliance 
>>> requires relative-system-id-entities to be resolved completely as 
>>> long as we have a URL.
>> Correct relative entity resolution also requires users to correctly 
>> utilize what the JAXP APIs provide.  If you don't provide an absolute 
>> URL for the document being parsed, relative URL references will fail. 
>>  If you do provide an absolute URL, entity references will work in a 
>> manner totally transparent to Digester, because this is a feature 
>> built in to the underlying SAX based parser.
> '../whatever.dtd' is not an url. XML parsers can therefore reject it 
> and still be specification compliant. (the url should be something 
> like 'file:../whatever.dtd'.) digester makes an attempt to resolve the 
> url in the standard java way which is more than the xml specification 
> requires in this case.

i should probably admit my mistake before others pick it up. relative
URLs do not need a scheme prefix, but an entity resolver cannot know the
base protocol. this is probably why the SAX specification says that
only fully resolved URLs should be passed in (i take this to mean
absolute URLs).
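
To illustrate why an absolute URL matters here, a minimal sketch follows. It is not code from Digester: the class name AbsoluteSystemId, its helper methods, and the path conf/whatever.xml are hypothetical. It only shows how a file can be turned into an absolute system ID for a SAX InputSource, and how a relative reference such as '../whatever.dtd' would then be resolved against it.

```java
import java.io.File;
import java.net.URI;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only; not part of Digester.
class AbsoluteSystemId {

    // Wrap a file in an InputSource whose system ID is an absolute file: URL,
    // so the parser has a base against which to resolve relative entities.
    static InputSource forFile(File file) {
        String systemId = file.getAbsoluteFile().toURI().toString();
        return new InputSource(systemId);
    }

    // Show the resolution a parser would be expected to perform:
    // a relative reference resolved against the document's absolute URL.
    static String resolveAgainst(String baseSystemId, String relative) {
        return URI.create(baseSystemId).resolve(relative).toString();
    }

    public static void main(String[] args) {
        InputSource source = forFile(new File("conf/whatever.xml"));
        System.out.println(source.getSystemId());
        System.out.println(resolveAgainst(source.getSystemId(), "../whatever.dtd"));
    }
}
```

An InputSource built this way could be handed to Digester's parse method that accepts an InputSource; whether the other parse overloads set the system ID themselves is the open question in this thread.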

IMO the only safe way for digester to deal with relative URLs is to
return null and leave the parser to try to sort out the mess. the only
issue would be distinguishing relative URLs from badly formed absolute
file URLs such as 'C:\whatever.dtd'. i don't think that returning null
in the case of a badly formed URL should make much difference (most
parsers should find the file in question) but i suppose digester could
test for the existence of a file (if people think that this is a serious
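
The return-null idea above can be sketched roughly as follows. This is an assumption about how such a resolver might look, not Digester's actual implementation: the class RelativeAwareResolver and the method isAbsoluteUrl are hypothetical names. Returning null from resolveEntity is the standard SAX way of telling the parser to fall back to its own default resolution.

```java
import java.net.URI;
import java.net.URISyntaxException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver, for illustration only; not Digester's own code.
class RelativeAwareResolver implements EntityResolver {

    // True only for well-formed URLs that carry a real scheme.
    static boolean isAbsoluteUrl(String systemId) {
        if (systemId == null) {
            return false;
        }
        try {
            URI uri = new URI(systemId);
            // A one-letter "scheme" is almost certainly a Windows drive
            // letter (e.g. C:/whatever.dtd), so it is not treated as a URL.
            return uri.isAbsolute() && uri.getScheme().length() > 1;
        } catch (URISyntaxException e) {
            // Badly formed, e.g. 'C:\whatever.dtd' (backslash is illegal).
            return false;
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (!isAbsoluteUrl(systemId)) {
            // Relative or badly formed: leave it to the parser.
            return null;
        }
        return new InputSource(systemId);
    }

    public static void main(String[] args) {
        RelativeAwareResolver resolver = new RelativeAwareResolver();
        System.out.println(resolver.resolveEntity(null, "../whatever.dtd"));
        System.out.println(isAbsoluteUrl("C:\\whatever.dtd"));
    }
}
```

In this sketch, relative references and badly formed file paths are handled the same way, which sidesteps the problem of telling them apart; a file-existence check could be added in the null branch if that turned out to be needed.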

- robert
