Hi all,

 

I've been looking at how to add an HTML5 serializer to the project.

 

So far my investigations have led me to add the following factory method to org.apache.cocoon.sax.component.XMLSerializer:

 

    public static XMLSerializer createHTML5Serializer() {
        XMLSerializer serializer = new XMLSerializer();

        serializer.setContentType(TEXT_HTML_UTF_8);
        serializer.setDoctypePublic("XSLT-compat");
        serializer.setEncoding(UTF_8);
        serializer.setMethod(HTML);

        return serializer;
    }

 

 

Using the HTML5 serializer in a test to print the output:

 

    @Test
    public void testHTML5Serializer() throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        newNonCachingPipeline()
            .setStarter(
                new XMLGenerator("<html><head><title>serializer test</title></head><body><p>test</p></body></html>")
            )
            .setFinisher(XMLSerializer.createHTML5Serializer())
            .withEmptyConfiguration()
            .setup(baos)
            .execute();

        // Decode with the same encoding the serializer was configured with,
        // rather than the platform default charset.
        String data = new String(baos.toByteArray(), "UTF-8");
        System.out.println(data);
    }

 

This prints:

 

<!DOCTYPE html PUBLIC "XSLT-compat">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>serializer test</title>
</head>
<body>
<p>test</p>
</body>
</html>

 

 

I have read a number of articles describing the issues with serializing HTML5, and so far this is the best I could come up with. It is not 100% conforming because:

- The doctype does not match, although it will not break in the browser; it should be <!DOCTYPE html>
- The charset should be declared as <meta charset="UTF-8"/> according to the HTML5 spec
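
To make the two points above concrete, here is a rough sketch in plain JAXP, outside of Cocoon. It rests on a few assumptions: that the serializer's settings end up as ordinary JAXP output properties, that the stock JDK transformer is in use, and that the document really is UTF-8. The class and method names (Html5OutputSketch, serialize, toShortHtml5Forms) are made up for illustration and are not part of Cocoon. The first method uses the doctype that the HTML5 spec allows for generators that cannot emit the short form, <!DOCTYPE html SYSTEM "about:legacy-compat">. The second is a crude text post-processing step that rewrites the doctype and the generated META element into the short HTML5 forms. As far as I know the http-equiv form of the charset declaration is still permitted by HTML5, so the second rewrite may be cosmetic.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper for illustration only; not part of Cocoon. */
public class Html5OutputSketch {

    /**
     * Serializes an XML string with the "html" output method and asks for
     * the legacy-compat system doctype instead of PUBLIC "XSLT-compat".
     */
    public static String serialize(String xml) throws TransformerException {
        // An identity transformer: it copies the input and only applies output properties.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, "html");
        identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        identity.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");

        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    /**
     * Text-level post-processing: replaces the first doctype with the short
     * HTML5 doctype and the first http-equiv Content-Type meta element with
     * the short charset form. Assumes the content is UTF-8.
     */
    public static String toShortHtml5Forms(String html) {
        return html
            .replaceFirst("(?i)<!DOCTYPE html[^>]*>", "<!DOCTYPE html>")
            .replaceFirst("(?i)<meta\\s+http-equiv=\"Content-Type\"[^>]*>",
                          "<meta charset=\"UTF-8\">");
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<html><head><title>serializer test</title></head>"
                   + "<body><p>test</p></body></html>";
        String html = serialize(xml);
        System.out.println(html);
        System.out.println(toShortHtml5Forms(html));
    }
}
```

In a pipeline the second step would have to run on the serialized bytes after the finisher, which is not very elegant; a serializer that writes the short forms directly would be cleaner, but I have not looked at what that would take.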

 

 

http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems/

http://www.w3schools.com/html5/tag_meta.asp

 

 

Does anyone have more knowledge on this subject?

 

Robby