From: Jasha Joachimsthal
Date: Fri, 6 Jan 2012 16:50:38 +0100
Subject: Re: HTML5 serializer
To: dev@cocoon.apache.org

Hey Robby,

which Cocoon version are you using for your project? In C2.1 and C2.2 there's not only an XMLSerializer but also an HTMLSerializer and an XHTMLSerializer, each for its specific needs. So why not create your own HTML5Serializer?

With HTML5, the specification teams tried to specify what browsers were already doing instead of writing a new theoretical specification. HTML5 should be backwards compatible with previous (X)HTML versions. This is the reason why some old elements are not deprecated but considered obsolete (remember marquee, it was so cool on Geocities).
The doctype doesn't really matter: browsers generally ignore the PUBLIC part of the doctype (apart from some hacks in IE that switch it into quirks mode).
A good presentation about HTML5 is http://vimeo.com/15755349.
Jasha Joachimsthal

Europe - Amsterdam - Oosteinde 11, 1017 WT Amsterdam - +31(0)20 522 4466
US - Boston - 1 Broadway, Cambridge, MA 02142 - +1 877 414 4776 (toll free)

www.onehippo.com

On 6 January 2012 15:48, Robby Pelssers <Robby.Pelssers@nxp.com> wrote:

> Hi all,
>
> I've been looking at how to add an HTML5 serializer to the project.
>
> So far my investigations have led me to add the following code to
> org.apache.cocoon.sax.component.XMLSerializer:
>
>     public static XMLSerializer createHTML5Serializer() {
>         XMLSerializer serializer = new XMLSerializer();
>
>         serializer.setContentType(TEXT_HTML_UTF_8);
>         serializer.setDoctypePublic("XSLT-compat");
>         serializer.setEncoding(UTF_8);
>         serializer.setMethod(HTML);
>
>         return serializer;
>     }
>
> Using the HTML5 serializer in a test to print the output:
>
>     @Test
>     public void testHTML5Serializer() throws Exception {
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>
>         newNonCachingPipeline()
>             .setStarter(
>                 new XMLGenerator("<html><head><title>serializer test</title></head><body><p>test</p></body></html>")
>             )
>             .setFinisher(XMLSerializer.createHTML5Serializer())
>             .withEmptyConfiguration()
>             .setup(baos)
>             .execute();
>
>         String data = new String(baos.toByteArray());
>         System.out.println(data);
>     }
>
> Would print
>
>     <!DOCTYPE html PUBLIC "XSLT-compat">
>     <html>
>     <head>
>     <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
>     <title>serializer test</title>
>     </head>
>     <body>
>     <p>test</p>
>     </body>
>     </html>
>
> I read a number of articles describing the issues with serializing HTML5,
> and so far this is the best I could come up with. It is not 100%
> conforming due to:
>
> - Non-matching doctype, although it will not break in the browser
>   -> should be <!DOCTYPE html>
> - The charset should be <meta charset="UTF-8"/> according to the HTML5 spec
>
> http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems/
> http://www.w3schools.com/html5/tag_meta.asp
>
> Does anyone have more knowledge on this subject?
>
> Robby
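
The two gaps listed in the quoted message (the doctype and the charset declaration) could in principle be closed by post-processing the serialized string. The following is only a minimal sketch under that assumption, using nothing but the JDK regex classes. The class name Html5PostProcessor and the method toHtml5 are hypothetical and are not part of Cocoon. It also assumes the serializer emits the doctype and the META element in the form shown in the quoted output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Cocoon class: rewrites already-serialized HTML text.
final class Html5PostProcessor {

    // Matches a DOCTYPE declaration such as <!DOCTYPE html PUBLIC "XSLT-compat">
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches the Content-Type META element shown in the quoted output
    private static final Pattern META_CONTENT_TYPE = Pattern.compile(
            "<meta\\s+http-equiv=\"Content-Type\"[^>]*>", Pattern.CASE_INSENSITIVE);

    private Html5PostProcessor() {
    }

    // Replaces the first doctype with <!DOCTYPE html> and the first
    // Content-Type META element with <meta charset="...">.
    static String toHtml5(String html, String charset) {
        String withDoctype = DOCTYPE.matcher(html).replaceFirst("<!DOCTYPE html>");
        String meta = "<meta charset=\"" + charset + "\">";
        return META_CONTENT_TYPE.matcher(withDoctype)
                .replaceFirst(Matcher.quoteReplacement(meta));
    }

    public static void main(String[] args) {
        // Input copied from the serializer output quoted in the message
        String serialized = String.join("\n",
                "<!DOCTYPE html PUBLIC \"XSLT-compat\">",
                "<html>",
                "<head>",
                "<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">",
                "<title>serializer test</title>",
                "</head>",
                "<body>",
                "<p>test</p>",
                "</body>",
                "</html>");
        System.out.println(toHtml5(serialized, "UTF-8"));
    }
}
```

A string rewrite like this is fragile compared with a dedicated serializer, since it depends on the exact text the underlying serializer produces. It is meant only to illustrate what the target output would look like.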
