perl-modperl mailing list archives

From (Randal L. Schwartz)
Subject mod_perl and utf8 and CGI->param
Date Tue, 02 Sep 2014 21:19:47 GMT

Getting really frustrated with mod_perl2's apparent inability to
properly read UTF8 input.

Here's my mod_perl2 setup:

  Apache 2.2.[something]
  mod_perl 2.0.7 (or nearly that)
  Perl "script" with

Very early in my app:

  ## ensure utf8 CGI params:
  $CGI::PARAM_UTF8 = 1;

  binmode STDIN, ":utf8";
  binmode STDOUT, ":utf8";
  binmode STDERR, ":utf8";

This works fine in CGI mode: when I ask for $foo = $cgi->param('foo'),
DBI::data_string_desc($foo) shows a UTF8 string with the proper
discrepancy between bytes and chars.
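
For reference, the bytes-versus-chars discrepancy can be seen in isolation with a minimal sketch like the one below. It assumes a hard-coded sample value (the two UTF-8 octets for "é") in place of a real form parameter, and uses only Encode and utf8::is_utf8 from core Perl.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Sample input (an assumption for illustration): "é" (U+00E9) as it
# travels on the wire, i.e. two UTF-8 octets.
my $octets = "\xC3\xA9";

# Decoding turns the two octets into a single character.
my $chars = decode('UTF-8', $octets);

printf "octets: length=%d utf8_flag=%d\n",
    length($octets), utf8::is_utf8($octets) ? 1 : 0;
printf "chars:  length=%d utf8_flag=%d\n",
    length($chars), utf8::is_utf8($chars) ? 1 : 0;
```

A properly decoded parameter should look like $chars here: fewer characters than octets, with the UTF8 flag set.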

But when I try to run it under mod_perl, the returned string appears
to be the raw bytes, and definitely not a decoded utf8 string.  Of
course, when I store that in the database (using DBD::Pg), the bytes
are treated as "latin-1" and encoded to "utf-8" a second time, and I
get a bunch of weird chars on the output.
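
The double encoding described above can be reproduced without a database. This is only a sketch of the effect: it assumes the same sample value ("é" as UTF-8 octets) and uses Encode::encode to stand in for whatever the driver does when it is handed an undecoded string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 octets for "é" that were never decoded (sample value, assumed).
my $octets = "\xC3\xA9";

# Treating the undecoded octets as Latin-1 characters and encoding them
# to UTF-8 encodes each octet separately, so 2 octets become 4.
my $double = encode('UTF-8', $octets);

# Decoding first and then encoding round-trips cleanly.
my $clean = encode('UTF-8', decode('UTF-8', $octets));

printf "double-encoded: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $double;
printf "round-tripped:  %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $clean;
```

The four-octet result is what shows up as weird chars when the stored value is read back and displayed as UTF-8.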

Has anyone managed to round-trip UTF8 from form to database and back
using a setup similar to this?

I suspect part of the problem is this in

    'read_from_client' => <<'END_OF_FUNC',
    # Read data from a file handle
    sub read_from_client {
    my($self, $buff, $len, $offset) = @_;
    local $^W=0;                # prevent a warning
    return $MOD_PERL
        ? $self->r->read($$buff, $len, $offset)
            : read(\*STDIN, $$buff, $len, $offset);

Since I binmode STDIN, the non-$MOD_PERL works ok here.  What's the
equivalent of $r->read() that marks the incoming stream as UTF8, so I
get chars instead of bytes?  Or can I just read(\*STDIN) in mod_perl2
as well? (I know that was supported at one point...)
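
One possible workaround, independent of how the body is read, is to decode each parameter value by hand after fetching it. The sketch below is not a confirmed fix for this setup: decode_param is a hypothetical helper (not part of CGI.pm or mod_perl), the sample value stands in for what $cgi->param('foo') returns, and it assumes the value really does arrive as undecoded UTF-8 octets.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode one raw parameter value, leaving undefined
# values and already-decoded (UTF8-flagged) strings alone.
sub decode_param {
    my ($value) = @_;
    return $value if !defined $value || utf8::is_utf8($value);
    # FB_CROAK dies on malformed input; LEAVE_SRC keeps $value intact.
    return Encode::decode('UTF-8', $value,
        Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Stand-in for a parameter as returned under mod_perl: raw octets.
my $raw = "na\xC3\xAFve";
my $foo = decode_param($raw);

printf "raw length: %d, decoded length: %d\n", length($raw), length($foo);
```

The UTF8-flag check is only a heuristic to avoid decoding twice in CGI mode, where the value is already a character string; it says nothing about whether the original input was valid.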

Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<> <URL:>
Perl/Unix consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig
