httpd-dev mailing list archives

From Jeff Trawick <trawi...@bellsouth.net>
Subject Re: [PATCH] experimental module mod_charset_lite
Date Fri, 26 May 2000 12:24:22 GMT
> From: "William A. Rowe, Jr." <wrowe@lnd.com>
> Date: Fri, 26 May 2000 01:08:03 -0500
> 
> > From: Greg Stein [mailto:gstein@lyra.org]
> > Sent: Thursday, May 25, 2000 11:54 PM
> > 
> > On Thu, 25 May 2000, Jeff Trawick wrote:
> > >...
> > > Any comments for improving it?
> > 
> > Why does the administrator specify the target character set? 
> > Isn't that a
> > function of some headers that the browser is supposed to send?
> 
> I would -presume- we are talking about a default, and that it
> will apply to ignorant browsers that we can't negotiate with.
> We should specify the worldwide default is ISO8859-1, per
> HTTP/1.0, but I guess this was the rest of the world's solution.
> 
As far as a world-wide default of "ISO8859-1"...  That particular
string works great on OS/390 and Solaris but not with glibc table
names (as in RedHat 6.1).  It seems that glibc uses a more proper
name, but then I would take issue with their name for the normal
OS/390 UNIX character set :)  (Interestingly, there are files in glibc
with the name ISO8859-1, but they don't seem to be accepted by
iconv(1).)
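
To illustrate the naming problem, here is a minimal sketch of probing at
run time for a spelling that the local iconv accepts. The helper name
first_accepted_name() and the candidate spellings in the usage note are
assumptions made for illustration; nothing here is taken from APR or
from mod_charset_lite.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: return the first name in `candidates` that
 * iconv_open() accepts as a source charset when converting to `to`,
 * or NULL if none of them is accepted on this platform. */
static const char *first_accepted_name(const char *to,
                                       const char *const *candidates,
                                       size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        iconv_t cd = iconv_open(to, candidates[i]);

        if (cd != (iconv_t)-1) {
            iconv_close(cd);      /* we only wanted to know that it opens */
            return candidates[i];
        }
    }
    return NULL;
}
```

A caller might pass a list such as { "ISO8859-1", "ISO-8859-1",
"ISO_8859-1" } and use whichever spelling comes back.  Which of those
succeed depends entirely on the platform's table names, which is the
point of the complaint above.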

As you know, this is a !@#$ing  mess.  You touched on this a few
weeks ago when you suggested using certain registered numbers for the
names.  I've kept it in the back of my mind but haven't decided what
to do with the suggestion yet.  Unfortunately, hardly anybody knows
that such numbers exist; in particular, iconv() implementations (or
rather their table names) don't deal with such numbers (AFAIK) and
browsers don't deal with such numbers either.

Currently, CharsetDefault needs to be set to whatever APR needs, and
existing mod_mime directives are used to tell the browser what we are
sending it.

I guess that CharsetDefault should default to whatever form of
ISO8859-1 is accepted by APR translation.  This needs to be
[auto]configured and stuck in apr.h as APR_ISO_8859_1, with values
like "ISO8859-1" (OS/390, Solaris), 

> Are all these parameters per directory (and perhaps, sometime
> in the future, per file?)

It is definitely per directory (tested) and hopefully is per file
if Location is used (untested as of yet :) ).  Of course, folks are
going to want to do it by file extension too, but I'm not in a big
hurry on such niceties because they are so clearly module problems
that they don't help me gain confidence that Apache core is doing the
right thing.

> 
> > > Any objections to shipping it in the experimental directory?
> > 
> > Not at all.
> 
> Nor here!  Glad to see it... I'm wondering, have you inked off
> an email to the Russian apache folks yet to let them know we are
> finally acknowledging their plight and looking at building some
> of this code into the core distribution with 2.0?

No; I do need to thank them for providing guidance with their existing
implementation as well as to alert them to some of the key aspects in
2.0, particularly those that differ greatly from their design:

1) following the precedent for how EBCDIC translation was done in 1.3, 
   generic translation is done in buff operations (ignoring some
   translation support in utility routines, of course)

   In Russian Apache there is translation logic sprinkled all over the 
   place (generally around calls to the buff operations).

   I doubt that the 2.0 implementation would be alarming to them.

2) translation support is in APR; Apache tries to avoid as many
   details as possible

   Russian Apache provides its own implementation of translation
   tables and Apache configuration is used to point to the translation
   table files.  They would hopefully want to avoid this if they
   rebased on 2.0, and instead provide a non-ICONV implementation in
   the APR translation code.  Alternatively, they might determine that
   in the last couple of years iconv support on their target platforms
   has arrived and works well and instead they would ship tables that
   work with iconv.

   Our 2.0 implementation has many configuration implications for
   them.

3) we don't pretend to solve all the problems solved by Russian
   Apache, but hopefully most/all of the ignored/postponed problems
   are in module space
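
Point 1 above (translating inside the buffered write path instead of
around every call to it) can be sketched roughly as follows.  This is
not Apache's buff code: xlate_write() is a hypothetical stand-in that
uses a stdio FILE and plain iconv() directly, only to show where the
conversion sits.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>

/* Hypothetical stand-in for a translating buffered write: convert
 * `inlen` bytes of `in` with the descriptor `cd` and write the result
 * to `out`.  Returns 0 on success, -1 on a conversion or write error. */
static int xlate_write(iconv_t cd, const char *in, size_t inlen, FILE *out)
{
    char chunk[256];
    char *inp = (char *)in;   /* iconv() wants char **; input is not modified */

    while (inlen > 0) {
        char *outp = chunk;
        size_t outleft = sizeof chunk;
        size_t produced;
        size_t rc = iconv(cd, &inp, &inlen, &outp, &outleft);
        int err = (rc == (size_t)-1) ? errno : 0;  /* save before fwrite() */

        /* Flush whatever was converted in this pass. */
        produced = sizeof chunk - outleft;
        if (fwrite(chunk, 1, produced, out) != produced)
            return -1;

        /* E2BIG only means the chunk filled up, so loop again.  A real
         * implementation would carry a trailing partial character
         * (EINVAL) over to the next call; this sketch just fails. */
        if (err != 0 && err != E2BIG)
            return -1;
    }
    return 0;
}
```

Callers keep writing untranslated data and never see the conversion,
which is the contrast with sprinkling translation logic around each
call site.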

Thanks for your comments,

Jeff

-- 
Jeff Trawick | trawick@ibm.net | PGP public key at web site:
     http://www.geocities.com/SiliconValley/Park/9289/
          Born in Roswell... married an alien...
