cocoon-users mailing list archives

From "Antonio Gallardo" <agalla...@agsoftware.dnsalias.com>
Subject RE: Sudden difference in interpretation of #160 - bug?
Date Tue, 06 Jan 2004 06:10:13 GMT
Hi:

I think your problem is related to file encoding. Please note that it is
not enough to change the encoding declaration at the beginning of an XML
file: the declaration only names an encoding, it does not make your text
editor save the file in that encoding.
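A minimal Python sketch of why the declaration alone does not help, assuming the character in question is the NO-BREAK SPACE (U+00A0) discussed in the quoted messages below. The variable names are made up for illustration. The same character is stored as different bytes depending on the encoding actually used, and UTF-8 bytes read as ISO-8859-1 produce the extra "A with circumflex":

```python
# The same character, NO-BREAK SPACE (U+00A0), as bytes in two encodings.
nbsp = "\u00a0"

as_latin1 = nbsp.encode("iso-8859-1")  # one byte: A0
as_utf8 = nbsp.encode("utf-8")         # two bytes: C2 A0

# A reader that assumes ISO-8859-1 but receives the UTF-8 bytes sees two
# characters: U+00C2 (A with circumflex) followed by U+00A0.
misread = as_utf8.decode("iso-8859-1")

print(as_latin1.hex(), as_utf8.hex(), [hex(ord(c)) for c in misread])
# prints: a0 c2a0 ['0xc2', '0xa0']
```

Whatever the declaration says, it is these bytes that the reader has to interpret.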

I have had similar problems before. I would recommend checking the encoding
your editor uses, and checking the actual encoding of the files.
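One possible way to check the actual encoding, sketched in Python. The helper names declared_encoding and is_valid_utf8 and the sample bytes are hypothetical, made up for illustration. The idea is to compare what the XML declaration claims with how the bytes really decode. The sketch assumes the file has no byte order mark and that a missing declaration means UTF-8:

```python
import re


def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration, or UTF-8 if absent."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', data)
    return match.group(1).decode("ascii") if match else "UTF-8"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: the declaration claims UTF-8, but the bytes were
# written as ISO-8859-1, as an editor with the wrong setting would do.
sample = '<?xml version="1.0" encoding="UTF-8"?><p>caf\u00e9</p>'.encode("iso-8859-1")

print(declared_encoding(sample), is_valid_utf8(sample))
# prints: UTF-8 False
```

For a real file, read the bytes with open(path, "rb").read() and pass them to both helpers. A declared UTF-8 together with a False result means the declaration and the saved bytes disagree.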

If it helps, I recommend jEdit (http://www.jedit.org):

This editor lets you switch between file encodings as well as line
endings. It also supports XML editing. It is my preferred editor for XML
files.

Another way to find more information about this problem is to read old
posts in the users mail archive. Search for the keywords "umlaut", "UTF" or
"ISO 8859". There is very interesting information on this topic. This was
one of the first errors I ran into when I started using Cocoon.

I hope this helps.

Best Regards,

Antonio Gallardo.

H.vanderLinden@MI.unimaas.nl said:
> Right, so far I came up with this:
>
> in the resulting source using 'serialize html' the NO-BREAK-SPACE shows up
> as &nbsp; (i.e. the string, not the character), using 'serialize xhtml' it
> shows up as &Acirc; (the character) followed by &nbsp; (the string).
>
> Originally my XSL file added a metatag ...iso-8859-1, but I removed this.
> No effect. Close examination of the code:
> Internet Explorer 6.0 SP2 adds a metatag ...iso-8859-1 (when I changed the
> metatag in the XSL file to UTF-8 IE6 changed it to ISO-8859-1). I haven't
> found any setting in IE6 where I could change the encoding.
> Opera 7.21 displays the same results, whether with or without the metatag.
> In Opera I set the option 'encoding to assume for pages lacking encoding'
> to utf-8, but still the same result.
>
> I use both browsers on my home pc too with the same results (I cannot say
> whether they are the exact same versions, but it is IE6.X and Opera 7.X)
>
>
>> -----Original Message-----
>> From: Marc Portier [mailto:mpo@outerthought.org]
>> Sent: Monday, 05 January 2004 12:01
>> To: users@cocoon.apache.org
>> Subject: Re: Sudden difference in interpretation of #160 - bug?
>>
>> H.vanderLinden@MI.unimaas.nl wrote:
>>
>> > Hi,
>> >
>> > thanks. I've read the article, in fact I read the entire thread, but
>> > either New Year's wine is still in my system (I don't drink :-)) or
>> > I've stumbled onto a configuration problem/bug in
>> > <map:serialize type='xhtml'/>
>> >
>> > Point is this: whether I enter &#160; or &#xA0; directly in my XSL
>> > stylesheet or through an entity reference
>> > <!ENTITY nbsp "&#160;"> (or the hex code), when I tell the serializer
>> > to use type xhtml I get an &Acirc; instead of &nbsp; when I change the
>> > type back to
>>
>> I take it you refer to &Acirc; just to communicate here, and that it is
>> not actually in the resulting file, right?
>>
>> That would be the most interesting thing to see now: what actually is in
>> that file (and not: what is showing up in the browser)
>
> How do I do this? I've tried a cocoon-view, but it shows the resulting
> page. And there is no start page. It starts out as an aggregation of xml
> files from various sources which are finally processed by an XSL file
> that transforms XML into (x)html. I've just tried adding a source:write
> and the result is that &nbsp; is represented by A0 characters.
>
>> When the A with ^ shows up it often means the browser received UTF-8 but
>> thinks it is iso-8859-1 anyway. As it happens (no need to go into the
>> depths of how the encoding works), some of the characters in UTF-8 are
>> encoded with more than one byte, in which case the leading byte commonly
>> maps to latin-1's "A with something" range (just so you recognise the
>> disease)
>
> This is exactly what is happening here: all "special" characters like
> &nbsp;, &copy; and &raquo; are prefixed with a &Acirc;. Only &nbsp; shows
> up as string, the rest as the character.
>
>> Could you check the encoding your browser is assuming, and compare it
>> with what the page or HTTP headers are saying?
>
> How? As said above: IE6 adds/modifies the metatag to iso-8859-1. Opera
> should use utf-8, but still shows the &Acirc;
>
>> > 'html' it works as expected.
>> >
>>
>> can be a number of reasons:
>> - I think by default html serializer is set to use iso-8859-1 as target
>>   encoding?
>
> I wouldn't know. I've checked the Cocoon sitemap.xmap (default Cocoon
> 2.1.3) and it has no encoding info for the htmlserializer.
>
>> - I think the html serializer (from xalan) would be introducing the
>>   &nbsp;
>
> Must be, since the resulting source shows &nbsp; (the string) for all
> instances, while other characters such as &copy; and &raquo; show up as
> the character.
>
> Bye, Helma
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
> For additional commands, e-mail: users-help@cocoon.apache.org
>

