perl-modperl mailing list archives

From "John N. Brahy" <jbr...@ad2.com>
Subject RE: is there a way to force UTF-8 encoding
Date Fri, 03 Mar 2006 22:06:56 GMT
> -----Original Message-----
> From: Christopher H. Laco [mailto:claco@chrislaco.com]
> Sent: Friday, March 03, 2006 12:53 PM
> To: claco@chrislaco.com
> Cc: John N. Brahy; modperl@perl.apache.org
> Subject: Re: is there a way to force UTF-8 encoding
> 
> Christopher H. Laco wrote:
> > Christopher H. Laco wrote:
> >> John N. Brahy wrote:
> >>>> -----Original Message-----
> >>>> From: Christopher H. Laco [mailto:claco@chrislaco.com]
> >>>> Sent: Friday, March 03, 2006 12:28 PM
> >>>> To: John N. Brahy
> >>>> Cc: modperl@perl.apache.org
> >>>> Subject: Re: is there a way to force UTF-8 encoding
> >>>>
> >>>> John N. Brahy wrote:
> >>>>> Is there a way to force UTF-8 encoding? I have tried
> >>>>>
> >>>>> AddDefaultCharset utf-8 in the httpd.conf
> >>>>>
> >>>>> OS: OpenBSD
> >>>>> Apache: Apache/1.3.29 (Unix) mod_perl/1.29 mod_ssl/2.8.16 OpenSSL/0.9.7g
> >>>>>
> >>>>> But
> >>>>> 1) wget -S says it's Content-Type: text/html; charset=ISO-8859-1
> >>>>> 2) when I try the HTML validator on w3c.org it tells me that it's
> >>>>> ISO-8859-1
> >>>>> 3) Internet Explorer and Firefox both have ISO-8859-1 selected
> >>>>> 4) Firefox's Page Info shows it as ISO-8859-1
> >>>>>
> >>>>> Anybody know a way to force it to utf-8?
> >>>> Are there actually any UTF-8 encoded characters in the output?
> >>>> If there aren't any, then the document can really be both encodings at
> >>>> the same time, unless of course the document also includes a BOM (Byte
> >>>> Order Mark).
> >>>>
> >>>> -=Chris
> >>> Yes, it's a Spanish site that I'm developing for Verizon and they have characters that show up incorrectly.
> >>>
> >>> http://www.verizonnoticias.com/
> >>>
> >>> We've done most everything to encode into HTML entities, but our client will need to copy and paste from MS Word, so they will definitely have more of these characters. Sometimes they show up as boxes and sometimes they show up as this character à even though it's actually a &ntilde;
> >>>
> >>>
> >> If you view the page in Firefox, and then manually select the UTF-8
> >> encoding from View -> Character Encoding -> Unicode(UTF-8) ...does the
> >> page then display correctly?
> >>
> > For me, in Firefox 1.5.0.1, it indeed loads as ISO-8859-1 Latin1. I
> > don't notice any questionable characters on the page.
> >
> > If I tell Firefox to use UTF-8, it looks the same for me.
> >
> > -=Chris
> >
> Also, subpages, like http://www.verizonnoticias.com/News/Article/272/
> work just fine and come through as UTF-8

It's the homepage that is giving me problems. We also just added HTML::Entities to the CMS, and when our content person pasted HTML into the field, the entities that were already there got escaped a second time, so we are now seeing the literal text "Espa&ntilde;ol" on the page, with the &ntilde; itself being escaped.
So now I'm off in another direction, trying to find out how to encode only the characters that are giving us problems. I think I'm going to have to write a custom encoder that encodes only the special characters.
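
A minimal sketch of such an encoder, written in Python purely for illustration (the CMS in this thread uses HTML::Entities, so this is not the code that was deployed). It assumes that "special characters" means anything outside ASCII, and that the input has already been decoded to text; raw bytes pasted from MS Word would have to be decoded from their real encoding first. The function name encode_non_ascii is hypothetical.

```python
from html.entities import codepoint2name


def encode_non_ascii(text: str) -> str:
    """Replace only non-ASCII characters with HTML entities.

    ASCII characters pass through untouched, including &, < and >,
    so pasted markup and existing entities such as &ntilde; are not
    escaped a second time.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII: leave as-is (assumption: tags and entities
            # in the pasted content are intentional and already valid).
            out.append(ch)
        elif cp in codepoint2name:
            # Use the named HTML 4 entity where one exists.
            out.append("&" + codepoint2name[cp] + ";")
        else:
            # Otherwise fall back to a decimal numeric reference.
            out.append("&#%d;" % cp)
    return "".join(out)


if __name__ == "__main__":
    print(encode_non_ascii("<b>Espa\u00f1ol</b>"))  # <b>Espa&ntilde;ol</b>
    print(encode_non_ascii("Espa&ntilde;ol"))       # Espa&ntilde;ol
```

Because the output is pure ASCII, it displays the same whether the browser treats the page as ISO-8859-1 or UTF-8. The trade-off is that a bare & or < typed as ordinary text is not escaped either, so this only suits fields that are meant to hold HTML.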

Thanks for your time and help,

John
