Return-Path:
Delivered-To: apmail-jakarta-commons-dev-archive@www.apache.org
Received: (qmail 1043 invoked from network); 29 Mar 2004 20:09:20 -0000
Received: from daedalus.apache.org (HELO mail.apache.org) (208.185.179.12)
  by minotaur-2.apache.org with SMTP; 29 Mar 2004 20:09:20 -0000
Received: (qmail 19249 invoked by uid 500); 29 Mar 2004 20:09:06 -0000
Delivered-To: apmail-jakarta-commons-dev-archive@jakarta.apache.org
Received: (qmail 19194 invoked by uid 500); 29 Mar 2004 20:09:05 -0000
Mailing-List: contact commons-dev-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
List-Unsubscribe:
List-Subscribe:
List-Help:
List-Post:
List-Id: "Jakarta Commons Developers List"
Reply-To: "Jakarta Commons Developers List"
Delivered-To: mailing list commons-dev@jakarta.apache.org
Received: (qmail 19181 invoked from network); 29 Mar 2004 20:09:05 -0000
Received: from unknown (HELO smtp-out2.blueyonder.co.uk) (195.188.213.5)
  by daedalus.apache.org with SMTP; 29 Mar 2004 20:09:05 -0000
Received: from [10.0.0.2] ([82.38.65.173]) by smtp-out2.blueyonder.co.uk
  with Microsoft SMTPSVC(5.0.2195.5600); Mon, 29 Mar 2004 21:09:10 +0100
Mime-Version: 1.0 (Apple Message framework v613)
In-Reply-To:
References: <24F035EA-7F86-11D8-B6A9-000A95C50B1C@dfki.de>
  <4067A340.4060802@apache.org> <40686271.5050508@apache.org>
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id:
Content-Transfer-Encoding: 7bit
From: robert burrell donkin
Subject: Re: [digester] can't resolve relative entities ?
Date: Mon, 29 Mar 2004 21:09:07 +0100
To: "Jakarta Commons Developers List"
X-Mailer: Apple Mail (2.613)
X-OriginalArrivalTime: 29 Mar 2004 20:09:10.0843 (UTC) FILETIME=[B2F818B0:01C415C9]
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N
X-Spam-Rating: minotaur-2.apache.org 1.6.2 0/1000/N

On 29 Mar 2004, at 20:52, robert burrell donkin wrote:

> On 29 Mar 2004, at 18:52, Craig McClanahan wrote:
>> Paul Libbrecht wrote:
>>
>>> I think Digester.parse(java.io.File) should do it for me, or?
>>> (this method does build an input-source with correct URL, btw)
>>> There's even, in the maven code, efforts towards making this an
>>> absolute path.
>>>
>> In theory it should ... but if it doesn't, you can easily construct a
>> URL for a file and use the technique I described.
>>
>>> But the problem remains: if you look at the code of Digester.java,
>>> there's nothing that keeps the URL of the file! And the call to the
>>> method configure() is without any parameter!
>>>
>> But that's a feature, not a bug :-). No code in Digester is
>> necessary, because it's all handled by the SAX parser underneath.
>>
>>> I do think, contrary to what Robert claims, that XML-compliance
>>> requires relative-system-id-entities to be resolved completely as
>>> long as we have a URL.
>>>
>> Correct relative entity resolution also requires users to correctly
>> utilize what the JAXP APIs provide. If you don't provide an absolute
>> URL for the document being parsed, relative URL references will fail.
>> If you do provide an absolute URL, entity references will work in a
>> manner totally transparent to Digester, because this is a feature
>> built in to the underlying SAX based parser.
>
> '../whatever.dtd' is not an url. XML parsers can therefore reject it
> and still be specification compliant. (the url should be something
> like 'file:../whatever.dtd'.) digester makes an attempt to resolve the
> url in the standard java way which is more than the xml specification
> requires in this case.

i should probably admit my mistake before others pick it up. relative
urls do not need a scheme prefix but an entity resolver cannot know the
base protocol. this is probably why the SAX specification says that only
fully resolved URLs should be passed in (i take this to mean absolute
URLs). IMO the only safe way for digester to deal with relative URLs is
to return null and leave the parser to try to sort out the mess.
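
to make the return-null idea concrete, here is a minimal sketch. it is not
digester's actual code: the class name NullForRelativeResolver and the
helper isAbsolute are hypothetical, and it assumes that "absolute" can be
approximated as "parses as a java.net.URI and has a scheme". anything else
gets null from resolveEntity, which in SAX tells the parser to fall back
on its own default resolution.

```java
import java.net.URI;
import java.net.URISyntaxException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver, for illustration only (not Digester's implementation).
class NullForRelativeResolver implements EntityResolver {

    // Assumption: a system id counts as absolute only if it parses as a URI
    // and carries a scheme, e.g. "file:/etc/whatever.dtd".
    static boolean isAbsolute(String systemId) {
        if (systemId == null) {
            return false;
        }
        try {
            return new URI(systemId).isAbsolute();
        } catch (URISyntaxException e) {
            // Badly formed ids such as "C:\whatever.dtd" land here, because a
            // backslash is not a legal URI character.
            return false;
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (!isAbsolute(systemId)) {
            // Relative or malformed: return null so the parser applies its
            // default behaviour against whatever base URL it already knows.
            return null;
        }
        // Absolute: hand the id straight back as the entity's source.
        return new InputSource(systemId);
    }
}
```

one caveat with this check: a windows path written with forward slashes,
such as 'C:/whatever.dtd', parses as a URI whose scheme is "C", so it
would be treated as absolute here.
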
the only issue would be to sort out the relative URLs from badly formed
absolute file URLs such as 'C:\whatever.dtd'. i don't think that
returning null in the case of a badly formed URL should make much
difference (most parsers should find the file in question) but i suppose
digester could test for the existence of a file (if people think that
this is a serious problem).

- robert

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org