Subject: Re: [Digester] How do I get Digester to ignore the DOCTYPE tag
From: Simon Kitching
To: Jakarta Commons Users List
Date: Thu, 15 Jul 2004 11:52:27 +1200
In-Reply-To: <40F5A4F7.5070707@apache.org>
Message-Id: <1089849147.19268.17.camel@pcsimon>

On Thu, 2004-07-15 at 09:26, Craig McClanahan wrote:
> Paolo Valladolid wrote:
>
> >I need to use Digester to parse XML that has been retrieved from a
> >database. The XML I'm working with was received from elsewhere (ie. Not
> >created by our team). How do I get Digester to ignore the DOCTYPE
> >tag? I've tried setValidating( false ) and it did not work.
>
> The setValidating(false) call does indeed tell Digester to not validate
> the XML data. However, it does *not* tell the underlying XML parser to
> skip the DOCTYPE, and there is no API in JAXP to say that sort of thing.
>
> If your problem is unresolved entities, one thing you can do is to
> provide your own EntityResolver method whose resolveEntity() method
> always returns null. That way, the parser won't go traipsing around the
> network trying to find things that it can't.

Hi Paolo,

I'm presuming the problem is that you have a DOCTYPE that references an
external document and want to suppress loading of the referenced
document, or have a DTD which declares an external entity and want to
suppress loading of the entity.

In other words, you don't want to ignore the DOCTYPE, you want to
suppress loading of external entities.

Craig's suggestion of writing an EntityResolver will work, but he has
made a minor mistake: if you return *null* from the entity resolver,
then the parser will apply its normal resolving rules, including
retrieving the entity (e.g. the DTD) from the specified URL. This is
explicitly stated in the javadoc for the org.xml.sax.EntityResolver
interface.

In order to ignore remote entities, you can instead get your
EntityResolver to return an InputSource that wraps an empty InputStream.

Note, however, that this can change the *meaning* of your xml document.
For example, if the DTD defines a default value for an attribute, then
ignoring the DTD will result in the attribute not getting its expected
value. In general, it is better to ensure you have a local copy of the
DTD, then use an EntityResolver to return the local DTD rather than
returning an empty stream.
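
As a rough illustration of the two resolver approaches described above,
here is a minimal sketch using only the JDK's JAXP and SAX classes. The
class name EntityResolverSketch, the method names, and the
example.invalid URL are made up for this example; only EntityResolver,
InputSource and the JAXP builder calls are real API. The first resolver
answers every external entity with an empty stream; the second serves
one known system id from a local file and otherwise returns null, which
leaves the parser's default lookup in place.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Digester or JAXP.
class EntityResolverSketch {

    // Answers every external entity with an empty stream, so nothing is
    // fetched. Any defaults the DTD would have supplied are lost.
    static EntityResolver emptyResolver() {
        return (publicId, systemId) ->
                new InputSource(new ByteArrayInputStream(new byte[0]));
    }

    // Serves one known system id from a local copy of the DTD.
    static EntityResolver localResolver(String knownSystemId, File localDtd) {
        return (publicId, systemId) -> {
            if (knownSystemId.equals(systemId)) {
                return new InputSource(localDtd.toURI().toString());
            }
            // null tells the parser to fall back to its normal resolving
            // rules, which may include fetching from the URL.
            return null;
        };
    }

    // Parses an XML string with the given resolver installed.
    static Document parse(String xml, EntityResolver resolver) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The system id below is a placeholder that is never contacted.
        String xml = "<!DOCTYPE config SYSTEM \"http://example.invalid/config.dtd\">"
                + "<config/>";
        Document doc = parse(xml, emptyResolver());
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Digester has a setEntityResolver method of its own, so a resolver built
this way can be handed to the Digester instance before calling parse
rather than to a DocumentBuilder.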

Still, if you *know* that the DTD doesn't have this sort of stuff in it,
returning an InputSource which wraps an empty stream will work ok.

If you happen to know that the underlying xml parser is Xerces then you
can use the setFeature method to disable loading of DTDs. However this
is parser-specific. See the Xerces documentation on "features" for more
info.

By the way, this is nothing to do with the Digester; it is related to
JAXP parsing in general. So you may be better off asking this on a list
for xml parsing & JAXP.

Regards,

Simon