I've found that it's useful to call String.trim() on the text before sending it to the XSLT engine/XML parser. We had a problem a while back with Oracle adding a 0x0 "bonus character" to the end of XML snippets extracted from the database. Trimming the snippets before inserting them into the document cured the problem.
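
Roughly what I mean, as a minimal sketch. The class name, the cleanSnippet helper, and the sample snippet are made up for illustration and are not our production code. String.trim() drops leading and trailing characters at or below U+0020, which covers a trailing 0x0. It does nothing about a control character in the middle of the text.

```java
class TrimSnippet {
    // Hypothetical helper: trim() removes leading and trailing characters
    // whose code is <= U+0020, so a trailing 0x0 is dropped with the whitespace.
    static String cleanSnippet(String snippet) {
        return snippet == null ? "" : snippet.trim();
    }

    public static void main(String[] args) {
        // Made-up snippet with the trailing 0x0 we saw coming back from the database.
        String fromDb = "<item>Gr\u00fc\u00dfe</item>\u0000";
        String cleaned = cleanSnippet(fromDb);
        System.out.println("trailing NUL removed: " + !cleaned.endsWith("\u0000"));

        // Characters in the middle are left alone, so this still contains 0x1e.
        String embedded = cleanSnippet("<item>a\u001eb</item>");
        System.out.println("embedded 0x1e still present: " + (embedded.indexOf('\u001e') >= 0));
    }
}
```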
 
JLS
----- Original Message -----
From: Andy Heninger
To: general@xml.apache.org
Sent: Wednesday, May 02, 2001 11:58 AM
Subject: Re: Unicode problem

From the XML spec,  http://www.w3.org/TR/REC-xml#charsets
 
[2]    Char    ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
 
0x1e is not in the list, so if one of your new data files happens to contain one, an invalid XML character error would be the expected result.
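
If it helps, here is a rough sketch of checking text against that production before it reaches the parser. The XmlChars class and the isXmlChar and stripInvalid names are hypothetical, not part of Xerces or Xalan, and the code assumes a JDK that has String.codePointAt. Whether silently dropping the offending characters is acceptable depends on your data. Rejecting the record or logging it may be the better choice.

```java
final class XmlChars {
    // True if the code point matches production [2] Char of the XML 1.0 spec.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper: copies the text, skipping every code point
    // that the Char production does not allow (for example 0x0 or 0x1e).
    static String stripInvalid(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample value containing the 0x1e from the error message.
        String dirty = "Stra\u00dfe\u001e 12";
        String clean = stripInvalid(dirty);
        System.out.println("removed " + (dirty.length() - clean.length()) + " character(s)");
    }
}
```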
 

Andy Heninger
IBM, Cupertino, CA
heninger@us.ibm.com
----- Original Message -----
From: Jonathan Cates
To: general@xml.apache.org
Sent: Monday, April 30, 2001 5:37 PM
Subject: Unicode problem

I am working on a project that is using the German language.  All our xml is
supposed to be headed with iso-8859-1.  Some data was recently loaded to the
database, and I am suddenly getting the following exception:

SystemId Unknown; Line 292; Column 24; ; Line#: 292; Column#: 24
javax.xml.transform.TransformerException: An invalid XML character (Unicode: 0x1e) was found in the element content of the document.
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:660)
        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1118)


Where the code looks like:
public void process(Source xml, Source xsl, Writer out) {
    try {
        TransformerFactory tFactory;
        Transformer serializer;

        tFactory = TransformerFactory.newInstance();

        serializer = tFactory.newTransformer(xsl);
        serializer.setOutputProperty(OutputKeys.ENCODING, "iso-8859-1");
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        serializer.transform(xml, new StreamResult(out));
    } catch (Exception ex) {
        ex.printStackTrace();

....

Is there something I have missed here?  If the doc doesn't have the
encoding="iso-8859-1" declaration, should this matter if I explicitly set it?
I am using v2 of xalan/xerces.  Any help is appreciated.

Thanks
Jon