xml-xalan-j-users mailing list archives

From Christopher Raber <cra...@avantgo.com>
Subject RE: How to turn off validation and resolution of DTD entities?
Date Tue, 31 Jul 2001 14:34:14 GMT

Thanks for the pointers. A little sharing for the benefit of others that
might struggle with this too:

I did a short-cut version of what you suggested:

        // Added to hook in my EntityResolver
        InputSource inputSource = new InputSource(reader);

        XMLReader xmlReader = XMLReaderFactory.createXMLReader();
        xmlReader.setEntityResolver(entityResolver);

        // END Added to hook in my EntityResolver

        SAXSource saxSource = new SAXSource(xmlReader, inputSource);

        transformer.transform(saxSource, new

Where entityResolver is an instance of MyEntityResolver:

    public class MyEntityResolver implements org.xml.sax.EntityResolver {

        // Maintains a cache of entities so they only have to be fetched once.
        // Each cache entry contains a byte array representing the contents of
        // the entity, keyed by systemId.
        Map entityCache;

        // Constructor
        public MyEntityResolver() {
            // Initialize entityCache. Synchronized because multiple threads
            // may be reading/writing the cache...
            entityCache = Collections.synchronizedMap(new HashMap());
        }

        // Resolve an entity based on its systemId. Not sure what publicId
        // is for here?
        public org.xml.sax.InputSource resolveEntity(String publicId, String systemId)
                throws org.xml.sax.SAXException, java.io.IOException {

            BufferedInputStream sourceIn;

            // Look in cache to see if we have already fetched this baby
            byte bytesIn[] = (byte[]) entityCache.get(systemId);

            // If not previously fetched, fetch and cache.
            if (bytesIn == null) {

                // Look in the configuration to see if this entity has been
                // mapped to a local copy.
                String localCopy = config.getEntity(systemId);
                if (localCopy != null) {
                    // return a special input source
                    String localFileName =
                    sourceIn = new BufferedInputStream(new
                } else {
                    // Assumes the systemId is a URL...
                    URL u = new URL(systemId);
                    sourceIn = new BufferedInputStream(u.openStream());
                }

                // Read the bytes from the entity into a byte array and cache it.
                // This assumes available() returns the total number of bytes
                // available in the underlying resource.
                int numBytes = sourceIn.available();
                bytesIn = new byte[numBytes];
                int offset = 0;
                while (numBytes > 0) {
                    int numBytesRead = sourceIn.read(bytesIn, offset, numBytes);
                    numBytes -= numBytesRead;
                    offset += numBytesRead;
                }

                // Cache bytes for this entity...
                entityCache.put(systemId, bytesIn);
            }

            return new org.xml.sax.InputSource(new ByteArrayInputStream(bytesIn));
        }
    }

MyEntityResolver does the following:

- Looks in local storage for entities that are specified in its
configuration. The config object above contains this information which was
read from an XML config file...
- Otherwise fetches the entity via its URL.
- Caches the bytes of each entity for quick retrieval. IO to the source
entity is only done once.
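
For anyone who wants something that compiles on its own, the three behaviours listed above could be sketched roughly as follows. This is not the code from the post. The class name CachingEntityResolver, the plain Map standing in for the config object, and the transform() helper are all assumptions made for illustration. The sketch also swaps the available()-based read for a loop that reads until end of stream, because available() is not guaranteed to report the full length of a resource, and it obtains the XMLReader through SAXParserFactory rather than XMLReaderFactory. On the publicId question: it is the public identifier from the DOCTYPE or entity declaration, if one was given, and it may be null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical name; a sketch of the caching resolver described in the post.
final class CachingEntityResolver implements EntityResolver {
    // systemId -> local file path. Stands in for the post's config object (assumed shape).
    private final Map<String, String> localCopies;
    // systemId -> entity bytes. Two threads may race to fetch the same entity; the result is the same.
    private final Map<String, byte[]> cache = Collections.synchronizedMap(new HashMap<>());

    CachingEntityResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        byte[] bytes = cache.get(systemId);
        if (bytes == null) {
            String localCopy = localCopies.get(systemId);
            // Prefer the configured local copy; otherwise treat systemId as a URL.
            try (InputStream in = (localCopy != null)
                    ? Files.newInputStream(Paths.get(localCopy))
                    : URI.create(systemId).toURL().openStream()) {
                bytes = readFully(in);
            }
            cache.put(systemId, bytes);
        }
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setSystemId(systemId); // keeps relative references inside the entity resolvable
        return source;
    }

    // Reads until end of stream instead of trusting available().
    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Illustrative helper: identity-transforms xml through a reader that uses the resolver.
    static String transform(String xml, EntityResolver resolver) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader xmlReader = spf.newSAXParser().getXMLReader();
        xmlReader.setEntityResolver(resolver);
        SAXSource saxSource = new SAXSource(xmlReader, new InputSource(new StringReader(xml)));
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(saxSource, new StreamResult(out));
        return out.toString();
    }
}
```

Because the resolver hands back a fresh ByteArrayInputStream on every call, the same cached bytes can be served to any number of parses without touching the file system or the network again.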

Thanks for the suggestions!



Chris Raber, Systems Engineer, AvantGo Inc.
v: 248-554-9330, cell: 810-839-3684

-----Original Message-----
From: Gary L Peskin [mailto:garyp@firstech.com]
Sent: Thursday, July 26, 2001 12:26 AM
To: Raber Chris
Cc: xalan-j-users@xml.apache.org
Subject: Re: How to turn off validation and resolution of DTD entities?

Chris --

There are a few things you need to consider when using EntityResolvers:
(1)  Two DOM (or DTM) trees are built:  one for the stylesheet and one
for the input document.  Do you want an EntityResolver for both or do
you only need it for one or the other?  I'm going to assume in this
example that you only want it for the input document.  If you need it
for the stylesheet, we'll have to jazz up this example.
(2)  At stylesheet creation time, additional stylesheets can be brought
in using xsl:include and xsl:import.  New readers are created for these
things that don't use your EntityResolver.  To trap this, you'll need to
create a URIResolver that creates and returns a SAXSource that has your
EntityResolver hooked into that XMLReader.  The same holds true for XML
input documents brought in at runtime with the document() function.
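
A minimal sketch of point (2), using only standard JAXP and SAX interfaces. The class name EntityAwareURIResolver is hypothetical, the href is resolved against the base with java.net.URI (an assumption about how the two should be combined), and the XMLReader comes from SAXParserFactory. Treat it as one possible shape, not as the way Xalan itself does it.

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical name; returns SAXSources whose XMLReader carries the given EntityResolver.
final class EntityAwareURIResolver implements URIResolver {
    private final EntityResolver entityResolver;

    EntityAwareURIResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Turn a possibly relative href into an absolute system identifier.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();

            // A fresh, namespace-aware reader per document, with the resolver hooked in.
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException(e);
        }
    }
}
```

Setting it on the TransformerFactory (setURIResolver) covers xsl:include and xsl:import at stylesheet creation time. Setting it on the Transformer covers documents loaded through document() at runtime.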

So, for this simple example, I'll assume that you only want the
EntityResolver at runtime and that there are no input documents brought
in with the document() function.

I'd code it like this (none of this is tested but should work :)).  It's
taken more or less from the "Usage Patterns" page at

import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.apache.xalan.serialize.SerializerFactory;
import org.apache.xalan.serialize.Serializer;
import org.apache.xalan.templates.OutputProperties;
import javax.xml.transform.Result;
import javax.xml.transform.sax.SAXResult;

// Instantiate a TransformerFactory.
TransformerFactory tFactory = TransformerFactory.newInstance();

// Cast the TransformerFactory to SAXTransformerFactory.
SAXTransformerFactory saxTFactory = (SAXTransformerFactory) tFactory;

// Create a Transformer ContentHandler to handle transformation of the
// XML Source
TransformerHandler transformerHandler
              = saxTFactory.newTransformerHandler(new

// Create an XMLReader and set its ContentHandler.
XMLReader reader = XMLReaderFactory.createXMLReader();

// Set the ContentHandler to also function as a LexicalHandler, which
// can process "lexical" events (such as comments and CDATA).

// Set your EntityResolver into the reader

// Set up a Serializer to serialize the Result to a file.
Serializer serializer = SerializerFactory.getSerializer
serializer.setOutputStream(new java.io.FileOutputStream("foo.out"));

// The Serializer functions as a SAX ContentHandler.
Result result = new SAXResult(serializer.asContentHandler());
// Parse the XML input document.
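
The pipeline described above, written out end to end using only JAXP classes, might look like the sketch below. The class name HandlerPipeline and the in-memory strings for the stylesheet and input are assumptions for illustration. A StreamResult is used in place of the Xalan Serializer so that the example does not depend on Xalan-specific classes, and the XMLReader is obtained from SAXParserFactory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical name; wires parser -> TransformerHandler -> result with a custom EntityResolver.
final class HandlerPipeline {
    static String run(String xsl, String xml, EntityResolver resolver) throws Exception {
        // Instantiate a TransformerFactory and cast it to SAXTransformerFactory.
        TransformerFactory tFactory = TransformerFactory.newInstance();
        SAXTransformerFactory saxTFactory = (SAXTransformerFactory) tFactory;

        // The TransformerHandler is a SAX ContentHandler that runs the stylesheet.
        TransformerHandler transformerHandler =
                saxTFactory.newTransformerHandler(new StreamSource(new StringReader(xsl)));

        // Create a namespace-aware XMLReader and set its ContentHandler.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(transformerHandler);

        // Let the handler also receive lexical events (comments, CDATA, entity boundaries).
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", transformerHandler);

        // Set the EntityResolver into the reader that parses the input document.
        reader.setEntityResolver(resolver);

        // Send the transformation output to an in-memory writer.
        StringWriter out = new StringWriter();
        transformerHandler.setResult(new StreamResult(out));

        // Parse the XML input document; the transformation runs as SAX events arrive.
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

The resolver here only affects the input document. Stylesheets pulled in with xsl:include or xsl:import, and documents loaded through document(), still go through whatever URIResolver is in effect.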


Raber Chris wrote:
> I have a need to turn off resolution of DTD entities
> when not connected to a network. Also I am thinking
> that hitting http://www.w3.org/ every time we bump
> into a DTD reference is a lot of overhead anyway. =:-o
> Based on a bit of Googling, it appears that
> implementing an EntityResolver that redirects remote
> TCP/IP destinations to a local cache is the ticket. If
> there is another/easier way, please advise.
> Currently I am using StreamSource and StreamResult as
> arguments to Transformer.transform, which is most
> convenient. I'd like to avoid hooking together an
> underlying parser, etc., if possible. I've really
> appreciated the simplicity of using the higher level
> TrAX APIs, and would like to stay there if I can.
> Simple good...
> Is there a way to hook in my own
> org.xml.sax.EntityResolver via a property, or must I
> instantiate my own underlying SAX/DOM handlers... and
> explicitly call setEntityResolver on the XMLReaders?
> If the latter, can someone provide basic instructions
> on how to string this together? Is the SAX2SAX example
> a good place to start?
> And does anyone have an example EntityResolver they
> would be willing to share?
> TIA,
> -Chris.
> PS: It would be real cool if it were possible to hook
> this via property settings...
