xml-xalan-j-users mailing list archives

From Gary L Peskin <ga...@firstech.com>
Subject Re: How to turn of validation and resolution of DTD entities?
Date Tue, 31 Jul 2001 17:03:54 GMT
Glad it worked out.  Thanks for sharing this info back with the list.

Gary

> Christopher Raber wrote:
> 
> Gary,
> 
> Thanks for the pointers. A little sharing for the benefit of others
> that might struggle with this too:
> 
> I did a short-cut version of what you suggested:
> 
>                 // Added to hook in my EntityResolver
>                 InputSource inputSource = new InputSource(reader);
> 
>                 XMLReader xmlReader = XMLReaderFactory.createXMLReader();
> 
>                 xmlReader.setEntityResolver(entityResolver);
>                 // END Added to hook in my EntityResolver
> 
>                 SAXSource saxSource = new SAXSource(xmlReader, inputSource);
> 
>                 transformer.transform(saxSource, new StreamResult(xalanOutStream));
> 
> Where entityResolver is an instance of MyEntityResolver:
> 
>     public class MyEntityResolver implements org.xml.sax.EntityResolver {
> 
>         // Maintains a cache of entities so they only have to be fetched once.
>         // Each cache entry contains a byte array representing the contents of
>         // the entity, keyed by systemId.
>         Map entityCache;
> 
>         // Constructor
>         public MyEntityResolver(){
>             // Initialize entityCache. Synchronized because multiple
>             // threads may be reading/writing the cache...
>             entityCache = Collections.synchronizedMap(new HashMap());
>         }
> 
>         // Resolve an entity based on its systemId. Not sure what
>         // publicId is for here?
>         public org.xml.sax.InputSource resolveEntity (String publicId, String systemId)
>                   throws org.xml.sax.SAXException, java.io.IOException {
> 
>             BufferedInputStream sourceIn;
> 
>             // Look in cache to see if we have already fetched this baby before.
>             byte bytesIn[] = (byte[])entityCache.get(systemId);
> 
>             // If not previously fetched, fetch and cache.
>             if(bytesIn == null){
> 
>                 // Look in the configuration to see if this entity has
>                 // been mapped to a local copy.
>                 String localCopy = config.getEntity(systemId);
>                 if(localCopy != null){
>                     // return a special input source
>                     String localFileName = config.getCachedEntityDir()+localCopy;
>                     sourceIn = new BufferedInputStream(new FileInputStream(localFileName));
>                 } else {
>                     // Assumes the systemId is a URL...
>                     URL u = new URL(systemId);
>                     sourceIn = new BufferedInputStream(u.openStream());
>                 }
> 
>                 // Read the bytes from the entity into a byte array and cache it.
>                 // Reads until end of stream: available() only reports how many
>                 // bytes can be read without blocking, not the total size of
>                 // the underlying resource.
>                 ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
>                 try {
>                     byte[] buffer = new byte[4096];
>                     int numBytesRead;
>                     while ((numBytesRead = sourceIn.read(buffer)) != -1){
>                         bytesOut.write(buffer, 0, numBytesRead);
>                     }
>                 } finally {
>                     sourceIn.close();
>                 }
>                 bytesIn = bytesOut.toByteArray();
>                 // Cache bytes for this entity...
>                 entityCache.put(systemId, bytesIn);
>             }
>             return new org.xml.sax.InputSource(new ByteArrayInputStream(bytesIn));
>         }
>     }
> 
> MyEntityResolver does the following:
> 
> - Looks in local storage for entities that are specified in its
> configuration. The config object above contains this information which
> was read from an XML config file...
> 
> - Otherwise fetches the entity via its URL.
> - Caches the bytes of each entity for quick retrieval. IO to the
> source entity is only done once.
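
For readers who want to try the caching idea in isolation, here is a minimal sketch. It is not Chris's class: the names CachingEntityResolver and Loader are made up for illustration, and the lookup that his config object performs is assumed to live behind the Loader interface. The sketch also sets the systemId on the returned InputSource, so the parser can still resolve relative references inside the cached entity.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class CachingEntityResolver implements EntityResolver {

    /** Hypothetical hook: fetches the raw bytes for a systemId (local file, URL, ...). */
    public interface Loader {
        byte[] load(String systemId) throws IOException;
    }

    // Entity bytes keyed by systemId; safe for concurrent readers and writers.
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Loader loader;

    public CachingEntityResolver(Loader loader) {
        this.loader = loader;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        byte[] bytes = cache.get(systemId);
        if (bytes == null) {
            // Two threads may both load the same entity the first time;
            // the duplicate work is harmless because the bytes are identical.
            bytes = loader.load(systemId);
            cache.put(systemId, bytes);
        }
        // Hand out a fresh stream each time so every parse starts at byte 0.
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

The loader runs once per systemId; later requests are served from memory.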
> 
> Thanks for the suggestions!
> 
> Regards,
> 
> -Chris.
> 
> Chris Raber, Systems Engineer, AvantGo Inc.
> v: 248-554-9330, cell: 810-839-3684
> http://www.avantgo.com/
> 
> -----Original Message-----
> From: Gary L Peskin [mailto:garyp@firstech.com]
> Sent: Thursday, July 26, 2001 12:26 AM
> To: Raber Chris
> Cc: xalan-j-users@xml.apache.org
> Subject: Re: How to turn of validation and resolution of DTD entities?
> 
> Chris --
> 
> There are a few things you need to consider when using EntityResolvers:
> (1)  Two DOM (or DTM) trees are built:  one for the stylesheet and one
> for the input document.  Do you want an EntityResolver for both or do
> you only need it for one or the other?  I'm going to assume in this
> example that you only want it for the input document.  If you need it
> for the stylesheet, we'll have to jazz up this example.
> (2)  At stylesheet creation time, additional stylesheets can be brought
> in using xsl:include and xsl:import.  New readers are created for these
> things that don't use your EntityResolver.  To trap this, you'll need to
> create a URIResolver that creates and returns a SAXSource that has your
> EntityResolver hooked into that XMLReader.  The same holds true for XML
> input documents brought in at runtime with the document() function.
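
Point (2) can be sketched roughly as follows. This is an untested-in-Xalan illustration using only standard JAXP and SAX classes; the class name ResolvingURIResolver is made up, and resolving href against base with java.net.URI is an assumption that holds only when base is a well-formed absolute URI.

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class ResolvingURIResolver implements URIResolver {

    private final EntityResolver entityResolver;

    public ResolvingURIResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Turn the (possibly relative) href into an absolute system id.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();

            // Stylesheets need a namespace-aware parser.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // Every document fetched through this resolver uses our EntityResolver.
            reader.setEntityResolver(entityResolver);
            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException(e);
        }
    }
}
```

Setting it with TransformerFactory.setURIResolver covers xsl:include and xsl:import at stylesheet creation time; setting it with Transformer.setURIResolver covers document() at runtime.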
> 
> So, for this simple example, I'll assume that you only want the
> EntityResolver at runtime and that there are no input documents brought
> in with the document() function.
> 
> I'd code it like this (none of this is tested but should work :)).  It's
> taken more or less from the "Usage Patterns" page at
> http://xml.apache.org/xalan-j/usagepatterns.html#sax
> 
> import javax.xml.transform.TransformerFactory;
> import javax.xml.transform.sax.SAXTransformerFactory;
> import javax.xml.transform.sax.TransformerHandler;
> import javax.xml.transform.stream.StreamSource;
> import org.xml.sax.XMLReader;
> import org.xml.sax.helpers.XMLReaderFactory;
> import org.apache.xalan.serialize.SerializerFactory;
> import org.apache.xalan.serialize.Serializer;
> import org.apache.xalan.templates.OutputProperties;
> import javax.xml.transform.Result;
> import javax.xml.transform.sax.SAXResult;
> 
> // Instantiate a TransformerFactory.
> TransformerFactory tFactory = TransformerFactory.newInstance();
> 
> // Cast the TransformerFactory to SAXTransformerFactory.
> SAXTransformerFactory saxTFactory = (SAXTransformerFactory) tFactory;
> 
> // Create a Transformer ContentHandler to handle transformation of the
> // XML Source
> TransformerHandler transformerHandler
>               = saxTFactory.newTransformerHandler(new StreamSource("foo.xsl"));
> 
> // Create an XMLReader and set its ContentHandler.
> XMLReader reader = XMLReaderFactory.createXMLReader();
> reader.setContentHandler(transformerHandler);
> 
> // Set the ContentHandler to also function as a LexicalHandler, which
> // can process "lexical" events (such as comments and CDATA).
> reader.setProperty("http://xml.org/sax/properties/lexical-handler",
>                         transformerHandler);
> 
> // Set your EntityResolver into the reader
> reader.setEntityResolver(myEntityResolver);
> 
> // Set up a Serializer to serialize the Result to a file.
> Serializer serializer = SerializerFactory.getSerializer
>               (OutputProperties.getDefaultMethodProperties("xml"));
> serializer.setOutputStream(new java.io.FileOutputStream("foo.out"));
> 
> // The Serializer functions as a SAX ContentHandler.
> Result result = new SAXResult(serializer.asContentHandler());
> transformerHandler.setResult(result);
> 
> // Parse the XML input document.
> reader.parse("foo.xml");
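
The same pipeline can also be sketched with JAXP classes only, which avoids the Xalan-specific serializer classes. This is an illustration under assumptions, not a replacement for the code above: the class name SaxPipeline is made up, it uses an identity TransformerHandler instead of foo.xsl so that it runs without any external files, and it assumes the default TransformerFactory is a SAXTransformerFactory.

```java
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxPipeline {

    /** Parses xml with the given EntityResolver and returns the serialized result. */
    public static String transform(InputSource xml, EntityResolver resolver) throws Exception {
        // Assumption: the platform's factory supports the SAX extensions.
        SAXTransformerFactory saxTFactory =
                (SAXTransformerFactory) TransformerFactory.newInstance();

        // Identity handler; pass a StreamSource for a stylesheet to
        // newTransformerHandler(...) to run a real transformation instead.
        TransformerHandler handler = saxTFactory.newTransformerHandler();
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();

        // The handler receives content events and lexical events (comments, CDATA).
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);

        // The hook this thread is about.
        reader.setEntityResolver(resolver);

        reader.parse(xml);
        return out.toString();
    }
}
```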
> 
> HTH,
> Gary
> 
> Raber Chris wrote:
> >
> > I have a need to turn off resolution of DTD entities
> > when not connected to a network. Also I am thinking
> > that hitting http://www.w3.org/ every time we bump
> > into a DTD reference is a lot of overhead anyway. =:-o
> >
> > Based on a bit of Googling, it appears that
> > implementing an EntityResolver that redirects remote
> > TCP/IP destinations to a local cache is the ticket. If
> > there is another/easier way, please advise.
> >
> > Currently I am using StreamSource and StreamResult as
> > arguments to Transformer.transform, which is most
> > convenient. I'd like to avoid hooking together an
> > underlying parser, etc., if possible. I've really
> > appreciated the simplicity of using the higher level
> > Trax apis, and would like to stay there if I can.
> > Simple good...
> >
> > Is there a way to hook in my own
> > org.xml.sax.EntityResolver via a property, or must I
> > instantiate my own underlying SAX/DOM handlers... and
> > explicitly call setEntityResolver on the XMLReaders?
> >
> > If the latter, can someone provide basic instructions
> > on how to string this together? Is the SAX2SAX example
> > a good place to start?
> >
> > And does anyone have an example EntityResolver they
> > would be willing to share?
> >
> > TIA,
> >
> > -Chris.
> >
> > PS: It would be real cool if it were possible to hook
> > this via property settings...
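
If the goal is simply to stop the parser from fetching DTDs at all, one commonly used approach is an EntityResolver that answers every request with an empty document. The sketch below assumes that is acceptable for the input, which is only true when the document does not rely on entities or default attribute values declared in the DTD. The class name SkipDtdExample and the element-counting handler are made up so that the example can run on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SkipDtdExample {

    // Answers every external entity request with an empty stream,
    // so nothing is read from the network or the file system.
    static final EntityResolver SKIP_EXTERNAL =
            (publicId, systemId) -> new InputSource(new StringReader(""));

    /** Parses xml without loading its DTD and returns the number of elements seen. */
    public static int countElements(String xml) throws Exception {
        final int[] count = {0};
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(SKIP_EXTERNAL);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }
}
```

The DOCTYPE in the input can point at an unreachable host and the parse still completes, because the resolver is consulted before any connection is attempted.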
> >
