From Elliotte Rusty Harold <elh...@metalab.unc.edu>
Subject ID Transformer problem
Date Tue, 08 Jan 2002 03:49:30 GMT
I've been bouncing this around in various mailing lists, trying to 
pin down where the problem lies. I'm becoming convinced that this is 
a bug in Xalan-J (and SAXON too) though certainly the JAXP 
documentation could be clearer.

Suppose we want to use DOM and JAXP to produce the following document or its
equivalent, modulo white space and other insignificancies:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
<svg xmlns="http://www.w3.org/2000/svg"/>

I claim the following code should do the trick, given that impl is a
DOMImplementation object:

       // Create the document
       DocumentType svgDOCTYPE = impl.createDocumentType(
        "svg", "-//W3C//DTD SVG 1.0//EN",
       Document doc = impl.createDocument(
        "http://www.w3.org/2000/svg", "svg", svgDOCTYPE);

       // Serialize the document onto System.out
       TransformerFactory xformFactory
        = TransformerFactory.newInstance();
       Transformer idTransform = xformFactory.newTransformer();
       Source input = new DOMSource(doc);
       Result output = new StreamResult(System.out);
       idTransform.transform(input, output);

However, what this in fact produces is

<?xml version="1.0" encoding="UTF-8"?>
<svg/>

There are two problems here:

1. The namespace is lost on the svg element.
2. The DOCTYPE declaration is lost.

The details are somewhat implementation-dependent. SAXON and Xalan-J 
both give these results. GNU JAXP, however, can mask the problem 
because it has a bug that cancels this one out.
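To check a particular implementation, the test can be packaged as a self-contained program along the following lines. This is only a sketch: the class name is made up for illustration, the DOMImplementation is obtained through DocumentBuilderFactory (one of several ways to get one), and the system identifier is an assumption, since the listing above does not show it. The result depends on whichever JAXP implementation is on the classpath, so no particular output is promised here.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

// Hypothetical class name, used only for this illustration.
class IdentityTransformRepro {

    static final String SVG_NS = "http://www.w3.org/2000/svg";
    static final String SVG_PUBLIC_ID = "-//W3C//DTD SVG 1.0//EN";
    // Assumption: the system ID is not shown in the original listing.
    // This is the published location of the SVG 1.0 DTD.
    static final String SVG_SYSTEM_ID =
        "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd";

    // Builds the one-element SVG document in memory and runs it
    // through the no-argument (identity) Transformer.
    static String serialize() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DOMImplementation impl =
                dbf.newDocumentBuilder().getDOMImplementation();

            DocumentType svgDOCTYPE = impl.createDocumentType(
                "svg", SVG_PUBLIC_ID, SVG_SYSTEM_ID);
            Document doc = impl.createDocument(SVG_NS, "svg", svgDOCTYPE);

            Transformer idTransform =
                TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            idTransform.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("identity transform failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = serialize();
        System.out.println(xml);
        // Report which pieces of the source document survived the copy.
        System.out.println("namespace declaration kept: "
            + xml.contains("xmlns=\"" + SVG_NS + "\""));
        System.out.println("DOCTYPE kept: " + xml.contains("<!DOCTYPE"));
    }
}
```

Running it against each implementation in turn shows which of the two pieces of information that implementation drops.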

I think both of these are problems with the idTransform. According to
the JAXP specification, p. 62, "If all that is desired is the simple
identity transformation of a source to a result, then 
TransformerFactory provides a newTransformer() method with no 
arguments. This method creates a Transformer that effectively copies 
the source to the result. This method may be used to create a DOM 
from SAX events or to create an XML or HTML stream from a DOM or SAX 
events."

This is less than perfectly clear on issues like whether it should
insert namespace attributes as necessary or include the DOCTYPE
declaration. However, what's actually output doesn't strike me as a
copy of the source to the result, because a lot of significant
information has been lost. It's arguable whether the DOCTYPE 
declaration should go in. However, I think it's pretty clear that 
the namespace declaration should be there.

Yes, I know that I could add the xmlns attribute, at least, manually.
However, I think this should be closer to the default behavior of
XMLSerializer where namespace declaration attributes are inserted
automatically in the stream as necessary.
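For reference, the manual workaround might look something like the sketch below. It adds the xmlns attribute to the root element through the DOM Level 2 setAttributeNS() method, and it asks the serializer for a DOCTYPE through the standard JAXP output properties rather than relying on the DocumentType node. The class name is hypothetical, and the system identifier is again an assumption, not something taken from the listing above.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;

// Hypothetical class name, used only for this illustration.
class IdentityTransformWorkaround {

    static final String SVG_NS = "http://www.w3.org/2000/svg";
    static final String SVG_PUBLIC_ID = "-//W3C//DTD SVG 1.0//EN";
    // Assumption: the system ID is not shown in the original listing.
    static final String SVG_SYSTEM_ID =
        "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd";

    static String serialize() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DOMImplementation impl =
                dbf.newDocumentBuilder().getDOMImplementation();

            // No DocumentType node here; the DOCTYPE is requested from
            // the serializer through output properties instead.
            Document doc = impl.createDocument(SVG_NS, "svg", null);

            // Declare the default namespace by hand as an attribute in
            // the reserved xmlns namespace.
            doc.getDocumentElement().setAttributeNS(
                XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns", SVG_NS);

            Transformer idTransform =
                TransformerFactory.newInstance().newTransformer();
            idTransform.setOutputProperty(
                OutputKeys.DOCTYPE_PUBLIC, SVG_PUBLIC_ID);
            idTransform.setOutputProperty(
                OutputKeys.DOCTYPE_SYSTEM, SVG_SYSTEM_ID);

            StringWriter out = new StringWriter();
            idTransform.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("identity transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize());
    }
}
```

This gets the desired stream out, but only because the caller duplicates information the DOM tree already carries, which is exactly the complaint.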

