perl-dev mailing list archives

From "Pavel V." <>
Subject ModPerl::RegistryCooker default_handler returned status
Date Sun, 31 Aug 2014 11:50:18 GMT

  Hi all.

  (Sorry for the unfinished letter I sent earlier.)

  Can anybody explain the historical reasons behind the ModPerl::RegistryCooker
  default_handler() implementation?
  It contains the following piece of code:

    # handlers shouldn't set $r->status but return it, so we reset the
    # status after running it
    my $old_status = $self->{REQ}->status;
    my $rc = $self->run;
    my $new_status = $self->{REQ}->status($old_status);
    return ($rc == Apache2::Const::OK && $old_status != $new_status)
        ? $new_status
        : $rc;

  The goal of the ModPerl::* modules is to "Run unaltered CGI scripts under mod_perl".
  If the scripts are unaltered, how can they change $r->status? I think the answer should
  be: they cannot.
  Why do we need to reset the status, then? I see no reason. Worse, this code makes it
  impossible to set $r->status correctly from 'altered' scripts, and it produces strange
  effects when the headers parser handles responses larger than 8k.
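
  To make the effect concrete, here is a minimal sketch that runs without mod_perl. The
  decision logic is copied from the snippet quoted above; MockRequest, default_handler_model
  and the local OK constant are hypothetical stand-ins for the real request object,
  default_handler() and Apache2::Const::OK.

```perl
use strict;
use warnings;

# Stand-in for Apache2::Const::OK (0 in httpd), defined locally so the
# sketch runs without mod_perl installed.
use constant OK => 0;

# Hypothetical mock of the request object: a status() accessor that
# returns the previous value when it is given a new one.
package MockRequest {
    sub new {
        my ($class, $status) = @_;
        return bless { status => $status }, $class;
    }
    sub status {
        my ($self, $new) = @_;
        my $old = $self->{status};
        $self->{status} = $new if defined $new;
        return $old;
    }
}

# Same decision logic as the quoted code, with $self->run replaced by a
# code ref that may change $r->status and returns a handler status.
sub default_handler_model {
    my ($r, $script) = @_;
    my $old_status = $r->status;
    my $rc         = $script->($r);
    my $new_status = $r->status($old_status);   # reset, but remember what the script set
    return ($rc == OK && $old_status != $new_status) ? $new_status : $rc;
}

# An 'altered' script that sets the status itself and returns OK.
my $r  = MockRequest->new(200);
my $rc = default_handler_model($r, sub { $_[0]->status(404); return OK });
printf "handler rc=%d, r->status=%d\n", $rc, $r->status;
```

  In this model the script's 404 ends up as the handler return value while r->status is
  back at 200, which is the combination discussed in the rest of this message.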

  Can anyone explain why this code appeared and which scenarios it was meant for?

  Apache's mod_cgi.c module returns OK regardless of the r->status value that was set
  from the script output.
  Why should the ModPerl handler behave differently?

  One good example of the problems introduced by this code is the 404 status.
  Let's look at a small script:

  #!/usr/bin/perl -w
  use strict;
  use CGI;

  my $q = CGI->new;
  #### Variant 1
  print $q->header(-charset => "windows-1251", -type => "text/html", -status => '404 Not Found');
  #### Variant 2
  #print $q->header(-charset => "windows-1251", -type => "text/html", -status => '404 Not Found');
  #print 'BIG RESPONSE:' . '*' x 8192;

  Under CGI, the browser receives only the "SMALL/BIG RESPONSE" string, with a 404 status code.
  Under mod_perl with the 'small response', the browser gets status 200 instead of 404, and
  the default Apache error-handler content (like 'Status: OK \n /path/to/ was not found on
  this server') is appended to the script's response.
  With the 'big response', the browser gets status 404 and the appended Apache error-handler
  content changes to 'Not Found \n The requested URL /perl/ was not found on this server.'.

  The reason for this behavior is that the script uses CGI.pm, which detects mod_perl.
  So such scripts are effectively 'altered' and can set $r->status themselves. After the
  script executes, we have r->status == 404. But ModPerl::RegistryCooker sets r->status back
  to 200 and returns 404 as the handler status, so Apache's ErrorDocument processing kicks in
  (under mod_cgi the handler status is OK, so no error processing occurs).
  Also, because r->status was changed back, we can get a 200 response code in the browser if
  the response has not been sent yet.
  The most interesting part is that status 404 is logged in the Apache access log anyway,
  which makes this problem harder to detect.

  OK, maybe this is entirely a CGI.pm problem? Let's look at the next example, a fully
  unaltered script:

  #!/usr/bin/perl -w
  use strict;
  print "Status: 404 Not Found\n";
  print "Content-Type: text/html; charset=windows-1251\n\n";
  #Small response
  print 'NOTFOUND:' . '*' x 81;
  #Big response
  #print 'NOTFOUND:' . '*' x 8192;

  When the response is small, the 8k buffer is not filled, and r->status is not changed
  until the CGI headers are parsed on buffer flush. So when the script finishes we have
  r->status == 200, and the handler status is 200 too. After the headers are parsed,
  r->status becomes 404, but the mod_perl handler returns status 200 to Apache.

  When the response is big, the 8k buffer is filled and flushed before the handler is done.
  So we have r->status == 404, and the handler status is 404 too. As a result, the
  ErrorDocument directive is applied again.
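
  The small/big difference can be sketched with a toy model of the buffering described
  above. This is not mod_perl's actual I/O code: new_request, script_print, flush_buffer and
  status_when_script_returns are hypothetical names, and the only assumptions are the 8k
  buffer and that the CGI headers are parsed on the first flush.

```perl
use strict;
use warnings;

use constant BUFFER_LIMIT => 8192;   # the 8k buffer size mentioned above

# Toy request: a status plus an output buffer whose first flush parses
# the CGI headers. All names here are hypothetical.
sub new_request { return { status => 200, buffer => '', headers_parsed => 0 } }

sub flush_buffer {
    my ($r) = @_;
    if (!$r->{headers_parsed}) {
        # Pick up a "Status: NNN" CGI header, if the script printed one.
        $r->{status} = $1 if $r->{buffer} =~ /^Status:\s*(\d{3})/m;
        $r->{headers_parsed} = 1;
    }
    $r->{buffer} = '';
}

sub script_print {
    my ($r, $text) = @_;
    $r->{buffer} .= $text;
    flush_buffer($r) if length($r->{buffer}) >= BUFFER_LIMIT;
}

# Replays the unaltered script and returns r->status as seen when the
# script returns (before the final flush) and after the final flush.
sub status_when_script_returns {
    my ($body_length) = @_;
    my $r = new_request();
    script_print($r, "Status: 404 Not Found\n");
    script_print($r, "Content-Type: text/html; charset=windows-1251\n\n");
    script_print($r, 'NOTFOUND:' . '*' x $body_length);
    my $seen = $r->{status};
    flush_buffer($r);                # final flush after the handler is done
    return ($seen, $r->{status});
}

printf "small: at return=%d, after flush=%d\n", status_when_script_returns(81);
printf "big:   at return=%d, after flush=%d\n", status_when_script_returns(8192);
```

  In this model the small response still shows status 200 when the script returns and only
  becomes 404 on the final flush, while the big response is already at 404 when the script
  returns.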

  The browser gets a 404 status in all of these cases; I don't completely understand how
  this is processed.

  If we enable the mod_deflate Apache module, things become even more interesting.
  With a small response, everything looks as before.
  With a big response, only the Apache error page is displayed, and none of the script's
  output.

  With mod_cgi we always see the script's output, with or without mod_deflate.

  So the ModPerl::* modules do not fully reach their goal of "Run unaltered CGI scripts under
  mod_perl".

  As follows from my analysis of mod_perl, mod_cgi and the httpd core, we need no check of
  the new r->status value after the script runs at all. Because the mod_perl handler status
  is not in (OK, DECLINED, DONE) for any r->status != 200, ap_die() is always called instead
  of ap_finalize_request_protocol() in ap_process_request(). All responses with non-200
  statuses are then processed through ErrorDocument, if one exists for that code, and then
  ap_send_error_response() is called for them. I think this is the wrong way. 'Altered'
  scripts have no practical use for $r->status($newValue) in the current state of the code.
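
  The dispatch rule described in this paragraph can be written down as a small sketch. It
  only restates the claim above and is not taken from the httpd source; dispatch_model is a
  hypothetical helper, and the constants are defined locally with the numeric values httpd
  uses for OK, DECLINED and DONE.

```perl
use strict;
use warnings;

# Numeric values of the httpd handler return codes, defined locally so
# the sketch runs without mod_perl.
use constant { OK => 0, DECLINED => -1, DONE => -2 };

# Hypothetical helper: names the httpd function that, per the analysis
# above, ends up handling the request for a given handler return value.
sub dispatch_model {
    my ($handler_rc) = @_;
    my $normal = $handler_rc == OK
              || $handler_rc == DECLINED
              || $handler_rc == DONE;
    return $normal ? 'ap_finalize_request_protocol' : 'ap_die';
}

for my $rc (OK, 404, 500) {
    printf "handler rc=%d -> %s\n", $rc, dispatch_model($rc);
}
```

  Under this reading, any script status other than 200 that default_handler() turns into the
  handler return value is routed to ap_die(), and from there to ErrorDocument handling.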

  So I propose to patch the ModPerl::* modules so that they run unaltered/altered scripts
  more correctly:

-    # handlers shouldn't set $r->status but return it, so we reset the
-    # status after running it
-    my $old_status = $self->{REQ}->status;
-    my $rc = $self->run;
-    my $new_status = $self->{REQ}->status($old_status);
-    return ($rc == Apache2::Const::OK && $old_status != $new_status)
-        ? $new_status
-        : $rc;
+    return $self->run;

   If an altered script needs to set a new handler status by doing 'return $newHandleStatus',
   sub run() should be changed to pick up the value returned from eval {}. But I think
   everyone who needs this has already written their own module handlers ;-)

   Thanks for your attention.
   I understand that it is hard to make such global changes and that not many people are
   interested in this. So please treat my letter as a request for help with historical
   research ;-)

   Let me repeat my question:
   Can anyone explain why this code appeared and which scenarios it was meant for?

   If anybody is interested, I propose to discuss this.


