cocoon-dev mailing list archives

From Robby Pelssers <Robby.Pelss...@nxp.com>
Subject HTML5 serializer
Date Fri, 06 Jan 2012 14:48:42 GMT
Hi all,

I've been looking at how to add an HTML5 serializer to the project.

So far my investigations have led me to add the following code to org.apache.cocoon.sax.component.XMLSerializer:

    public static XMLSerializer createHTML5Serializer() {
        XMLSerializer serializer = new XMLSerializer();

        serializer.setContentType(TEXT_HTML_UTF_8);
        serializer.setDoctypePublic("XSLT-compat");
        serializer.setEncoding(UTF_8);
        serializer.setMethod(HTML);

        return serializer;
    }
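
For readers who want to try these settings outside Cocoon, here is a minimal sketch using only the JDK's JAXP identity transformer. It assumes that the setters above (setMethod, setEncoding, setDoctypePublic) correspond to the standard JAXP output properties of the same meaning; that mapping is an assumption and not something shown in this message. The class name Html5OutputPropertiesSketch and the serialize helper are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Cocoon: applies the same three settings
// through plain JAXP output properties.
final class Html5OutputPropertiesSketch {

    static String serialize(String xml) {
        try {
            // An identity transformer copies the input to the output and
            // only applies the serialization (output) properties.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            // Assumed equivalent of serializer.setDoctypePublic("XSLT-compat").
            transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "XSLT-compat");

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(
                "<html><head><title>serializer test</title></head>"
                + "<body><p>test</p></body></html>"));
    }
}
```

The exact doctype and meta lines depend on the XSLT implementation on the classpath, so the sketch makes no claim about byte-for-byte output.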


Using the HTML5 serializer in a test to print the output:

    @Test
    public void testHTML5Serializer() throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        newNonCachingPipeline()
        .setStarter(
           new XMLGenerator("<html><head><title>serializer test</title></head><body><p>test</p></body></html>")
        )
        .setFinisher(XMLSerializer.createHTML5Serializer())
        .withEmptyConfiguration()
        .setup(baos)
        .execute();

        // Decode with the serializer's encoding, not the platform default.
        String data = new String(baos.toByteArray(), "UTF-8");
        System.out.println(data);
    }

This prints:

<!DOCTYPE html PUBLIC "XSLT-compat">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>serializer test</title>
</head>
<body>
<p>test</p>
</body>
</html>


I read a number of articles describing the issues with serializing HTML5, and so far this is
the best I could come up with. The output is not 100% conforming, for two reasons:

* The doctype does not match, although it will not break in the browser. It should be
  <!DOCTYPE html>.

* According to the HTML5 spec, the charset declaration should be <meta charset="UTF-8"/>.
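
One possible workaround for the two points above is to post-process the serialized text. This is only a sketch under stated assumptions: it is plain string rewriting, not a Cocoon or JAXP feature, it assumes the doctype is the first thing in the output, and the names Html5PostProcessSketch and toHtml5 are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing step, not an existing Cocoon component.
final class Html5PostProcessSketch {

    // Matches a leading doctype such as <!DOCTYPE html PUBLIC "XSLT-compat">.
    private static final Pattern DOCTYPE = Pattern.compile(
            "\\A\\s*<!DOCTYPE\\s+html\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches the Content-Type meta element that the html output method inserts.
    private static final Pattern CONTENT_TYPE_META = Pattern.compile(
            "<meta\\s+http-equiv=\"Content-Type\"[^>]*>", Pattern.CASE_INSENSITIVE);

    static String toHtml5(String html, String charset) {
        // Replace the leading doctype with the short HTML5 form.
        String result = DOCTYPE.matcher(html).replaceFirst("<!DOCTYPE html>");
        // Replace the first Content-Type meta element with the short charset form.
        String meta = "<meta charset=\"" + charset + "\">";
        return CONTENT_TYPE_META.matcher(result)
                .replaceFirst(Matcher.quoteReplacement(meta));
    }

    public static void main(String[] args) {
        String serialized = "<!DOCTYPE html PUBLIC \"XSLT-compat\">\n<html>\n<head>\n"
                + "<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n"
                + "<title>serializer test</title>\n</head>\n</html>";
        System.out.println(toHtml5(serialized, "UTF-8"));
    }
}
```

The sketch emits the meta element without a trailing slash because the output is HTML syntax, not XML. Regex rewriting is fragile (for example, it would miss a meta element whose attributes come in a different order), so a fix in the serializer itself would be preferable if one exists.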


http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems/
http://www.w3schools.com/html5/tag_meta.asp


Does anyone have more knowledge on this subject?

Robby


