Hi Lisa,

It may be worth having a look at the Xalan-specific 'xalan:entities' serializer output property:
http://xml.apache.org/xalan-j/usagepatterns.html#outputprops
http://permalink.gmane.org/gmane.text.docbook.apps/20285
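
A minimal sketch of how such a property could be set through JAXP, in case it helps. The key is written in the "{namespace}local-name" form that Transformer.setOutputProperty expects for implementation-specific properties. The class name, the transform helper and the resource name passed in main are illustrative assumptions rather than anything from your code; the resource name is what I believe to be Xalan's default for the xml method, so please check it against the documentation linked above. Other JAXP implementations will most likely just ignore the property, and I have not verified that it changes how emoji are serialized.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EntitiesPropertySketch {

    // Xalan-specific output property, written in JAXP "{namespace}local-name" form.
    static final String XALAN_ENTITIES = "{http://xml.apache.org/xalan}entities";

    // Hypothetical helper: runs an identity transform over the given XML string
    // and returns the serialized result as UTF-8 bytes.
    static byte[] transform(String xml, String entitiesResource) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        // Assumption: the value names an entity table resource on the classpath.
        t.setOutputProperty(XALAN_ENTITIES, entitiesResource);

        // Handing the serializer an OutputStream lets it apply the encoding itself.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Assumed resource name; verify it against your Xalan version.
        byte[] bytes = transform("<msg>hi \uD83D\uDE00</msg>",
                "org/apache/xml/serializer/XMLEntities");
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Comparing the printed output with and without the setOutputProperty(XALAN_ENTITIES, ...) line should show whether the property has any effect in your setup.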

HTH...

Regards,
Sergey


On 07.08.13 20:32, Lisa Smith wrote:

Hello,

My XSLT transformations have been successful for months, until I ran across an XML file containing Unicode emoji characters. I need to preserve these characters, but the XSLT output converts them to HTML entities. I thought that setting the encoding to UTF-8 would solve my problem, but I'm still having issues.

If I set the output property OutputKeys.METHOD to "text", the emoji remain, but all of my XML elements are stripped. When I set OutputKeys.METHOD to "xml", the emoji are converted to HTML entities.

Any help appreciated. Code:

private byte[] transform(InputStream stream) throws Exception {
    System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.processor.TransformerFactoryImpl"); 

    Transformer xmlTransformer;

    xmlTransformer = (TransformerImpl) TransformerFactory.newInstance().newTransformer(new StreamSource(createXsltStylesheet()));
    xmlTransformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

    XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(stream,"UTF-8");
    Source staxSource = new StAXSource(reader, true); 
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    Writer writer = new OutputStreamWriter(outputStream, "UTF-8");
    xmlTransformer.transform(staxSource, new StreamResult(writer));


    return outputStream.toByteArray();
}


Thanks!

Lisa