httpd-dev mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Proposals for Improvements in International Character Support
Date Mon, 07 Feb 2000 13:36:20 GMT
Dean Gaudet asked me to forward the following to the full list. I/Basis
would like to make a contribution to the Apache effort in the area of
charset management. Feedback on this will help me find something that is
actually useful to pitch in on.


Dean,

There are two ideas in tension here. On the one hand, it is more reliable
and convenient to derive charset from content, since it saves the trouble of
managing file names in ways that are not common. For mime types, by
contrast, the obvious file suffix conventions are ubiquitous. Of course, I
say that from the context of owning some rather complex, and as yet not
open-source, technology for detecting charset.

On the other hand, even if the contents were available via mmap, avoiding any
extra cost in the 'open' and 'read' department, there would still be a CPU
time cost for deriving the type on the fly.
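
To make the tradeoff concrete, here is a minimal sketch of deriving a charset from mapped file contents. It is only an illustration under stated assumptions: the names sniff_charset and sniff_file are hypothetical, the heuristic (byte-order mark, then ASCII, then UTF-8 validity, then a fixed fallback) is far simpler than a real detector, and Python stands in for what would be C inside the server.

```python
import codecs
import mmap
import os

# Byte-order marks, with UTF-32 LE listed before UTF-16 LE because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_charset(sample, fallback="iso-8859-1"):
    """Guess a charset label for a bytes sample (hypothetical heuristic)."""
    for bom, name in _BOMS:
        if sample.startswith(bom):
            return name
    if sample.isascii():
        return "us-ascii"
    # final=False tolerates a multi-byte sequence cut off at the end of
    # the sample, which can happen when only a prefix of the file is read.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return fallback  # assumption: undecodable text is Latin-1
    return "utf-8"


def sniff_file(path, limit=65536):
    """Map a file read-only and sniff at most `limit` leading bytes."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return sniff_charset(b"")  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return sniff_charset(mm[:limit])
```

Even this trivial check touches every byte of the sample on each request, which is the CPU cost in question.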

Thus, I am led to the following notion: a tool that you could run over a
hierarchy that would precompute full mime types, including charset, and
store them (perhaps in a dbm file) for rapid access at runtime. This could
be a new function in the module vector: a function that some executable
could call while scanning a tree and sorting out all the types.
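
A rough sketch of such a precompute tool follows. Everything in it is an assumption made for illustration: the function names build_type_db, lookup_type, and guess_charset are hypothetical, the charset guess is a placeholder (UTF-8 if the bytes decode cleanly, otherwise ISO-8859-1), the mime type comes from the file suffix, and Python's standard dbm module stands in for whatever dbm library the server would actually link against.

```python
import dbm
import mimetypes
import os


def guess_charset(data):
    """Placeholder detector: UTF-8 if it decodes cleanly, else ISO-8859-1."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "iso-8859-1"
    return "utf-8"


def build_type_db(root, db_path):
    """Walk `root` and store 'type; charset=...' per file; return the count.

    Keys are paths relative to `root`, with forward slashes. Keep
    `db_path` outside `root` so the database files are not scanned.
    """
    count = 0
    with dbm.open(db_path, "c") as db:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                mime, _encoding = mimetypes.guess_type(name)
                if mime is None:
                    continue  # unknown suffix: leave it to the server
                value = mime
                if mime.startswith("text/"):
                    with open(full, "rb") as f:
                        value += "; charset=" + guess_charset(f.read())
                key = os.path.relpath(full, root).replace(os.sep, "/")
                db[key.encode("utf-8")] = value.encode("utf-8")
                count += 1
    return count


def lookup_type(db_path, rel_path):
    """Runtime side: one keyed read, with no inspection of file contents."""
    with dbm.open(db_path, "r") as db:
        value = db.get(rel_path.encode("utf-8"))
    return value.decode("utf-8") if value is not None else None
```

The expensive content inspection happens once, offline; at request time the server would only do the keyed lookup. A real tool would also need some way to notice files that changed after the scan, which this sketch ignores.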

All in all, I think that I need to follow the ongoing flow of discussion in
this general area for a while before I launch off into proposing changes,
even for 2.x. I am particularly curious as to whether the mmap-ish work is
likely to lead to the routine mapping of file contents into memory at the
head of the module chain where other modules could avail themselves, on the
cheap, of the bytes.

By the way, I did propose a patch to make type_checker a 'runall' function,
but the Rodent of Unusual Size pointed out that the value of the change
didn't appear to outweigh the doc impact. Given the availability of fixup as
an alternative venue, I saw no reason to argue.

Thanks for taking the time to help a newcomer navigate the space,

Benson
