From: bugzilla@apache.org
To: bugs@httpd.apache.org
Subject: DO NOT REPLY [Bug 42105] New: - Patch for mod_autoindex to set the character set
Date: Thu, 12 Apr 2007 10:12:56 -0700 (PDT)

http://issues.apache.org/bugzilla/show_bug.cgi?id=42105

           Summary: Patch for mod_autoindex to set the character set
           Product: Apache httpd-2
           Version: 2.3-HEAD
          Platform: All
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: mod_autoindex
        AssignedTo: bugs@httpd.apache.org
        ReportedBy: poeml@suse.de


[Summarizing from the dev list here. See
http://marc.info/?l=apache-httpd-dev&m=117027634505806&w=2 and following posts.]

Users have a problem with directory listings generated by mod_autoindex: it is not possible to control the character set with which the response is marked.

The server cannot know what the real encoding on disk is, so it makes a very rough guess based on the OS it is running on: APR_HAS_UNICODE_FS, which, as far (as little) as I looked, is 1 on Windows and 0 on Linux. Depending on it, mod_autoindex decides whether to add a (fixed) charset to the content type:

    #if APR_HAS_UNICODE_FS
        ap_set_content_type(r, "text/html;charset=utf-8");
    #else
        ap_set_content_type(r, "text/html");
    #endif

The thing is that Linux has for ages used filesystems that can hold UTF-8 encoded filenames, and since a system-wide UTF-8 locale is becoming more and more widespread, filenames encoded as such are occurring much more frequently. This means that on many servers the content type needs to be set appropriately so that the browser can display things correctly.

My first thought was to define APR_HAS_UNICODE_FS to 1, but that could be just as wrong; it only means that the filesystem is Unicode capable, not that the actual filenames happen to be encoded like that. Instead, the right value depends only on site-specific needs.

Thus, I think the right way is to make the character set configurable. I am attaching a patch which adds an "AddDirectoryIndexCharset" directive to the mod_autoindex configuration.

The patch actually removes the dependency on APR_HAS_UNICODE_FS. My train of thought here is that utf-8 can (and should) be the default, unless configured otherwise.
This fits Windows (it has always been like that), and it (largely) fits Linux. But I don't know about other platforms.

On Thu, Feb 01, 2007 at 11:13:38AM -0600, William A. Rowe, Jr. wrote:
> Dr. Peter Poeml wrote:
> > On Thu, Feb 01, 2007 at 10:59:46 +0000, Joe Orton wrote:
> >> On Wed, Jan 31, 2007 at 09:45:12PM +0100, Dr. Peter Poeml wrote:
> >>> Users have a problem with directory listings generated by mod_autoindex:
> >>> It is not possible to control the character setting with which the
> >>> response is marked.
> >> AddDefaultCharset does allow this already as you mention in the bug.
> >> Can't users who insist on using filenames using one encoding and file
> >> content using another simply use:
> >>
> >>   AddDefaultCharset UTF-8
> >>   AddCharset ISO-8859-1 .html
> >>
> >> or similar?
> >
> > I don't think so, because it means
> > 1) that all .html files would need to be ISO-8859-1
> > 2) you cannot have files with charset=somethingelse anymore
> > 3) all non-html files would need to be UTF-8 then, unless you add
> >    AddCharset directives for all of them...
>
> And you can't match by name. I'm reviewing the patch, but I'll already
> offer a +1 on the concept.


On Thu, Feb 01, 2007 at 10:01:52PM +0100, Ruediger Pluem wrote:
> In the general case I agree with Joe that if things can be done with existing
> directives / code, no new directives / code should be added, but this case here
> is different.
>
> I think it is the ultimate duty of the content generator to set the correct
> content type / encoding. So in this case this would be mod_autoindex. Whether
> mod_autoindex detects this automatically or has a directive to set this is another
> story. Currently I would be in favour of a directive provided that there is
> no reliable and performant autodetection mechanism.
>
> From my point of view AddDefaultCharset and AddCharset should be used to
>
> - configure the "core content generator" of httpd (serving static files)
> - help fixing broken content generators who cannot set the encoding correctly
>   by themselves
>
> So +1 on the general concept.

Cool.

Here is the patch against trunk, with documentation added. I hope I got the way of patching the documentation right. A review would be very much appreciated.

Thanks,
Peter
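
For illustration only, here is a minimal standalone sketch of the idea described in the report: the charset comes from configuration and falls back to utf-8 when nothing is configured, instead of depending on APR_HAS_UNICODE_FS. It is not the attached patch and does not use the httpd API. The names autoindex_content_type and AUTOINDEX_DEFAULT_CHARSET are hypothetical, and the `configured` argument merely stands in for whatever value the proposed directive would supply. In the module itself, the resulting string would presumably be passed to ap_set_content_type().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical default, following the report's suggestion that utf-8
 * be used unless the site configures something else. */
#define AUTOINDEX_DEFAULT_CHARSET "utf-8"

/*
 * Build the Content-Type value for a generated directory listing.
 *
 * `configured` stands in for the value of the proposed directive;
 * NULL or an empty string means "not configured".
 * Returns 0 on success, -1 if `buf` is too small for the result.
 */
int autoindex_content_type(const char *configured, char *buf, size_t len)
{
    const char *charset = (configured != NULL && configured[0] != '\0')
                              ? configured
                              : AUTOINDEX_DEFAULT_CHARSET;

    /* snprintf returns the length it wanted to write, so a value
     * >= len means the output was truncated. */
    int n = snprintf(buf, len, "text/html;charset=%s", charset);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With this sketch, an unconfigured directory would be labelled text/html;charset=utf-8, while a site whose filenames are stored in another encoding could pass that charset name instead. The choice is then a per-site decision rather than a compile-time guess about the platform.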