cocoon-docs mailing list archives

Subject [DAISY] Updated: HTMLSerializer
Date Wed, 24 Sep 2008 15:44:58 GMT
A document has been updated:

Document ID: 896
Branch: main
Language: default
Name: HTMLSerializer (unchanged)
Document Type: Sitemap Component (unchanged)
Updated on: 9/24/08 3:44:37 PM
Updated by: David Legg

A new version has been created, state: draft


Long description
This part has been added.
Mime type: text/xml
File name: null
Size: 2873 bytes

<p>The HTMLSerializer renders the final output of a Cocoon pipeline as HTML
suitable for a web browser.</p>

<p>Though HTML and XML look similar, there are a number of subtle differences.
The standard HTMLSerializer delegates the job of serialization to the
'html' output method exposed through JAXP (Java API for XML Processing).  By
default in Cocoon this is implemented by the Xalan processor (though other XSLT
processors can be used).  This method performs the following actions:</p>

<li>Certain empty tags are not closed.  For example, &lt;br/&gt; or
&lt;br&gt;&lt;/br&gt; elements will be output as &lt;br&gt;.  For HTML 4.01 the
empty elements are: <em>area</em>, <em>base</em>, <em>basefont</em>,
<em>br</em>, <em>col</em>, <em>frame</em>, <em>hr</em>, <em>img</em>,
<em>input</em>, <em>isindex</em>, <em>link</em>, <em>meta</em> and
<em>param</em>.  Note that some of these tags are deprecated in some versions
of HTML.</li>
<li>Tags are considered to be case insensitive.  Therefore &lt;br/&gt; or
&lt;BR&gt;&lt;/BR&gt; or &lt;Br&gt;&lt;/Br&gt; will all be recognized as the
HTML br tag and be output as &lt;br&gt; with no end tag.</li>
<li>Any content between <em>script</em> or <em>style</em> tags is not escaped.
For example: &lt;script&gt;if (a &amp;lt; b) foo()&lt;/script&gt; will
output as: &lt;script&gt;if (a &lt; b) foo()&lt;/script&gt;</li>
<li>Attribute values containing '&lt;' characters are not escaped.</li>
<li>Boolean attributes are output in shortened form.  For example: &lt;option
selected="selected"&gt; is output as: &lt;option selected&gt;</li>

<p>More details can be found by consulting the
<a href="">W3C XSL
Transformations (XSLT)</a> reference document.</p>

<p class="warn">Though the HTMLSerializer goes a long way towards producing HTML
output, it does not guarantee a fully conformant HTML document from whatever
source it is fed.  For example, if you specify that the output should be strict
HTML 4.01 and your input contains: &lt;img src="pic.jpg" align="right"/&gt;
the align attribute will still be output even though it is deprecated and not
part of strict HTML 4.01.</p>
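<p>The warning above can be illustrated with plain JAXP.  This is a sketch
under assumptions rather than Cocoon's implementation: it uses an identity
transformer from the JDK's default TransformerFactory, sets the HTML 4.01
Strict doctype through the standard output properties, and uses the
made-up names <em>DoctypeSketch</em> and <em>toStrictHtml</em>.  The doctype is
only written into the output; the content is not checked against it.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for this illustration.
class DoctypeSketch {

    // Serializes XML as HTML and declares the HTML 4.01 Strict doctype.
    static String toStrictHtml(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "html");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            // Public and system identifiers of HTML 4.01 Strict.
            t.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
                                "-//W3C//DTD HTML 4.01//EN");
            t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                                "http://www.w3.org/TR/html4/strict.dtd");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // The align attribute is copied through despite the strict doctype.
        System.out.println(toStrictHtml(
            "<html><body><p><img src=\"pic.jpg\" align=\"right\"/></p></body></html>"));
    }
}
```

<p>If conformance matters, the output would need to be checked by a separate
validation step, since the serializer itself passes the attribute through.</p>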


<p>By default an HTMLSerializer is registered under the name 'html' and
configured to produce HTML 4.01 'loose', which is also known as 'HTML 4.01
Transitional'.  This doctype allows structural, semantic and presentational
elements (e.g. font) but not framesets.  Using the default serializer is as
easy as adding the following to your sitemap:</p>

<pre>&lt;map:pipeline id="demo"&gt;
  &lt;map:match pattern="*.html"&gt;
    &lt;map:generate src="page.xml"/&gt;
    <strong>&lt;map:serialize type="html"/&gt;</strong>
  &lt;/map:match&gt;
&lt;/map:pipeline&gt;</pre>


