From Leon Widdershoven <...@dds.nl>
Subject Re: Transofrmer from Invalid HTML to XHTML or XML
Date Thu, 22 Apr 2004 19:50:04 GMT
XML: All tags must be closed. For the document to be valid, all elements must also be declared.

I think you can declare things like &nbsp; and other character entities in a
DTD (not sure though, but it ought to work), as they are entity references
rather than XML elements.

<IMG ...> without a closing tag, though, will never ever be XML. The
appropriate XHTML form is <img ... /> (lowercase and self-closed), and the
browsers I use all accept such an image tag.
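
The two points above (entities declared in a DTD, and the closed image tag) can be checked with any XML parser. What follows is a minimal sketch in Python, not anything Cocoon-specific; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML-style image tag that is never closed: not well-formed XML.
unclosed_img = '<p><IMG SRC="a.png"></p>'

# XHTML-style self-closed image tag: well-formed.
closed_img = '<p><img src="a.png" /></p>'

# &nbsp; is not predefined in XML, so an undeclared use is rejected.
undeclared_entity = '<p>a&nbsp;b</p>'

# Declaring the entity in an internal DTD subset makes it usable.
declared_entity = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'

for name, sample in [
    ("unclosed_img", unclosed_img),
    ("closed_img", closed_img),
    ("undeclared_entity", undeclared_entity),
    ("declared_entity", declared_entity),
]:
    print(name, is_well_formed(sample))
```

With the entity declared, the parser expands &nbsp; to the non-breaking space character (U+00A0) in the element text.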

You really do need to use the HTMLGenerator if you want to process HTML,
because whatever goes through the pipeline must be XML, and HTML is just not
strict enough for that.
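
To make the idea of "read lenient HTML, emit well-formed XML" concrete, here is a rough sketch in Python. It is only a stand-in for the concept: it is not how Cocoon's HTMLGenerator is implemented, and the names SoupToXml, soup_to_xml and VOID are invented for this example. It assumes a single root element and handles only simple cases.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never have a closing tag (partial list, for illustration).
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class SoupToXml(HTMLParser):
    """Hypothetical converter: lenient HTML in, well-formed XML out."""

    def __init__(self):
        # convert_charrefs=True turns &nbsp; and friends into real characters.
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        # Valueless attributes (e.g. "checked") get their own name as value.
        attrib = {k: (v if v is not None else k) for k, v in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.open_tags:
            return  # ignore stray or redundant closing tags
        # Close any elements left open inside the one being closed.
        while self.open_tags:
            top = self.open_tags.pop()
            self.builder.end(top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text outside the root element
            self.builder.data(data)


def soup_to_xml(html):
    parser = SoupToXml()
    parser.feed(html)
    parser.close()
    # Close whatever the input never closed.
    while parser.open_tags:
        parser.builder.end(parser.open_tags.pop())
    return ET.tostring(parser.builder.close(), encoding="unicode")


print(soup_to_xml('<P>a&nbsp;b<IMG SRC="a.png"><BR>c'))
```

The result can be parsed by a strict XML parser, which is the property a pipeline needs before any transformer or serializer can work on the content.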

If you do not want your documents to be served that way, you can use a
serializer that serializes to HTML (not XHTML). I don't know whether a
serializer is available that removes things like </img> and </input>, though.
A transformer is naturally not good enough, as by definition it transforms
XML into other XML.
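
The difference between serializing the same tree as XML and as HTML can be sketched outside Cocoon as well. The snippet below uses Python's standard library purely as an illustration of the idea; it says nothing about which Cocoon serializers exist or how they behave, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# A small well-formed document with two empty elements.
doc = ET.fromstring('<p>Logo: <img src="a.png" /><input name="q" /></p>')

# XML output keeps the empty elements self-closed.
as_xml = ET.tostring(doc, encoding="unicode", method="xml")

# HTML output writes void elements such as img and input without any
# closing tag or trailing slash.
as_html = ET.tostring(doc, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The tree is the same in both cases; only the last step decides whether the output is XML-style or HTML-style, which is why this is a serializer concern rather than a transformer one.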

Leon

Upayavira wrote:

> laurent_rorive@marinepower.com wrote:
>
>>
>> Dear Upayavira,
>>
>> this is not a question of generator but more a question of serializer, I think
>
>
> Well, what you want to do is use the HTMLGenerator to read in the 
> HTML, and then serialize it as XML or as XHTML. You can't serialize 
> HTML directly, as you can't use it within a pipeline because it isn't 
> valid XML. Therefore you convert it to valid XML with the 
> HTMLGenerator, then serialize it.
>
> Hope that makes sense.
>
> Upayavira
>
>>
>> Upayavira <uv@upaya.co.uk>
>> 22/04/2004 10:42
>> Please respond to users
>>
>> To: users@cocoon.apache.org
>> cc:
>> Subject: Re: Transofrmer from Invalid HTML to XHTML or XML
>>
>> laurent_rorive@marinepower.com wrote:
>>
>> >
>> > Dear Members,
>> >
>> > I have some HTML with special characters as &nbsp;  <IMG  > .... that
>> > I want to save as valid XML or XHTML.
>>
>> Look at the HTMLGenerator in the HTML block.
>>
>> Upayavira
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
>> For additional commands, e-mail: users-help@cocoon.apache.org
>>
>>
>

