perl-dev mailing list archives

From Perrin Harkins <per...@elem.com>
Subject Re: ModPerl::RegistryCooker default_handler returned status
Date Sun, 31 Aug 2014 15:50:32 GMT
Hi Pavel,

You might get more interesting answers on the "users" mailing list, but
there's lots of discussion about this in the archives.  For example:
http://www.gossamer-threads.com/lists/modperl/modperl/45559?search_string=registry%20status;#45559
http://www.gossamer-threads.com/lists/modperl/modperl/88690?search_string=registry%20status;#88690

I think the gist of it is that some people make their Registry scripts set
$r->status because Registry doesn't scan the script output for a status
header like mod_cgi does.  However, that means Apache doesn't know what the
real status is.  In order to tell Apache what the status is, Registry
checks to see if you set $r->status and uses that as the return code, which
may affect what status Apache decides to send.
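
To make that concrete, here is a minimal pure-Perl sketch of the status
handling described above. It does not use mod_perl at all: MockRequest and
simulate_default_handler are hypothetical stand-ins invented for this
illustration, and OK is assumed to be 0, as in Apache. The logic mirrors the
default_handler snippet quoted further down.

```perl
use strict;
use warnings;

use constant OK => 0;    # assumed value of Apache2::Const::OK

# Hypothetical stand-in for the Apache request object ($r).
package MockRequest;

sub new {
    my ($class) = @_;
    return bless { status => 200 }, $class;
}

# Like $r->status: returns the current value and, when given an
# argument, stores the new one.
sub status {
    my ($self, $new) = @_;
    my $old = $self->{status};
    $self->{status} = $new if defined $new;
    return $old;
}

package main;

# Same logic as the default_handler code under discussion; $script is a
# code ref that plays the role of the Registry script.
sub simulate_default_handler {
    my ($r, $script) = @_;
    my $old_status = $r->status;
    my $rc         = $script->($r);
    my $new_status = $r->status($old_status);    # reset, but remember what was set
    return ($rc == OK && $old_status != $new_status) ? $new_status : $rc;
}

# A "script" that sets the status to 404 and returns OK.
my $r  = MockRequest->new;
my $rc = simulate_default_handler($r, sub { $_[0]->status(404); return OK });

printf "handler returned %d, r->status is now %d\n", $rc, $r->status;
```

In this sketch the handler's return value becomes 404 while the request
status is put back to 200, which is the combination the quoted message is
asking about.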

I may be reading your use case incorrectly, but you seem to be trying to
tell the server you have a 200 OK but really return a 404.  That's
discussed a little bit here:
http://perl.apache.org/docs/2.0/api/Apache2/RequestRec.html#C_status_
That is somewhat sophisticated for a Registry script, but you can always
just make your own subclass and override that method in RegistryCooker to
do what you want.  It's intended to be customizable in that way.
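
As a rough illustration of the subclassing idea, here is a pure-Perl sketch
that runs without mod_perl. MockRequest, MockCooker and PassThroughCooker
are hypothetical names made up for this example. MockCooker only imitates
the status-handling tail of default_handler; the real method may do more
work before running the script, and a real override would need to preserve
that. With mod_perl, the subclass would presumably inherit from
ModPerl::Registry and be named in PerlResponseHandler instead.

```perl
use strict;
use warnings;

my $OK = 0;    # assumed value of Apache2::Const::OK

# Hypothetical stand-in for the Apache request object.
package MockRequest;

sub new {
    my ($class) = @_;
    return bless { status => 200 }, $class;
}

sub status {
    my ($self, $new) = @_;
    my $old = $self->{status};
    $self->{status} = $new if defined $new;
    return $old;
}

# Hypothetical stand-in for ModPerl::RegistryCooker, reduced to the
# part of default_handler that deals with the status.
package MockCooker;

sub new {
    my ($class, $r, $script) = @_;
    return bless { REQ => $r, SCRIPT => $script }, $class;
}

sub run {
    my ($self) = @_;
    return $self->{SCRIPT}->($self->{REQ});
}

sub default_handler {
    my ($self) = @_;
    my $old_status = $self->{REQ}->status;
    my $rc         = $self->run;
    my $new_status = $self->{REQ}->status($old_status);
    return ($rc == $OK && $old_status != $new_status) ? $new_status : $rc;
}

# Subclass that returns the script's own return code and leaves
# the request status as the script set it.
package PassThroughCooker;
our @ISA = ('MockCooker');

sub default_handler {
    my ($self) = @_;
    return $self->run;
}

package main;

# A "script" that sets the status to 404 and returns OK.
my $script = sub { $_[0]->status(404); return $OK };

my $r1  = MockRequest->new;
my $rc1 = MockCooker->new($r1, $script)->default_handler;
my $r2  = MockRequest->new;
my $rc2 = PassThroughCooker->new($r2, $script)->default_handler;

printf "base:     rc=%d status=%d\n", $rc1, $r1->status;
printf "subclass: rc=%d status=%d\n", $rc2, $r2->status;
```

The override here simply passes the script's return code through; any other
policy could go in the same place.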

- Perrin



On Sun, Aug 31, 2014 at 7:50 AM, Pavel V. <pavel2000@ngs.ru> wrote:

>
>   Hi all.
>
>   (Sorry for the unfinished message I sent earlier.)
>
>   Can anybody explain the historical reasons for the
> ModPerl::RegistryCooker default_handler() implementation?
>   We can see the following piece of code there:
>
>     # handlers shouldn't set $r->status but return it, so we reset the
>     # status after running it
>     my $old_status = $self->{REQ}->status;
>     my $rc = $self->run;
>     my $new_status = $self->{REQ}->status($old_status);
>     return ($rc == Apache2::Const::OK && $old_status != $new_status)
>         ? $new_status
>         : $rc;
>
>   The goal of the ModPerl::* modules is to "run unaltered CGI scripts under
> mod_perl".
>   If the scripts are unaltered, how can they change $r->status? I think the
> answer should be: they cannot.
>   Why do we need to reset the status, then? I see no reason. But this code
> makes it impossible to set $r->status correctly from 'altered' scripts, and
> it produces strange effects from the header parser on responses larger
> than 8k.
>
>   Can anyone explain why this code was added and which scenarios it is
> meant for?
>
>   Apache's mod_cgi.c module returns OK regardless of the r->status value
> that was set from the script output.
>   Why should the ModPerl handler behave differently?
>
>   One good example of the problems introduced by this code is the 404
> status.
>   Let's look at a small script:
>
>   #!/usr/bin/perl -w
>   use strict;
>   use CGI;
>
>   my $q = CGI->new;
>   #### Variant 1
>   print $q->header(-charset => "windows-1251", -type => "text/html",
>                    -status => '404 Not Found');
>   print "SMALL RESPONSE";
>   #### Variant 2
>   #print $q->header(-charset => "windows-1251", -type => "text/html",
>   #                 -status => '404 Not Found');
>   #print 'BIG RESPONSE:' . '*' x 8192;
>
>   Under CGI, the browser shows only the "SMALL/BIG RESPONSE" string, with a
> 404 status code.
>   Under mod_perl, for the small response the browser gets status 200 instead
> of 404, and the script output is followed by the default Apache
> error-handler content (like 'Status: OK \n /path/to/script.pl was not found
> on this server').
>   For the big response, the browser gets status 404 and the appended
> error-handler content changes to 'Not Found \n The requested URL
> /perl/test.pl was not found on this server.'.
>
>   The reason for this behavior is that the script uses CGI.pm, which detects
> mod_perl.
>   So such scripts are effectively 'altered' and can set $r->status
> themselves. After the script runs, we have r->status == 404. But
> ModPerl::RegistryCooker sets r->status back to 200 and returns 404 as the
> handler status, so the Apache ErrorDocument processing kicks in.
>   (Under mod_cgi the handler status is OK, so no error processing occurs.)
>   Also, because r->status is changed back, the browser can get a 200
> response code if the response has not been sent yet.
>   Most interesting of all, a 404 status is logged in the Apache access log
> in either case, which makes the problem harder to detect.
>
>   OK, maybe this is purely a CGI.pm problem? Let's look at the next
> example, a fully unaltered script:
>
>   #!/usr/bin/perl -w
>   use strict;
>   print "Status: 404 Not Found\n";
>   print "Content-Type: text/html; charset=windows-1251\n\n";
>   #Small response
>   print 'NOTFOUND:' . '*' x 81;
>   #Big response
>   #print 'NOTFOUND:' . '*' x 8192;
>
>   When the response is small, the 8k buffer is not filled, and r->status is
> not changed internally until the CGI headers are parsed on buffer flush. So
> when the handler finishes we have r->status == 200, and the handler status
> is 200 too. After the headers are parsed, r->status becomes 404, but the
> mod_perl handler has returned status 200 to Apache.
>
>   When the response is big, the 8k buffer is filled and flushed before the
> handler is done.
>   So we have r->status == 404, and the handler status is 404 too. As a
> result, the ErrorDocument directive is applied again.
>
>   The browser gets a 404 status in all of these cases; I don't completely
> understand how this is processed.
>
>   If we enable the mod_deflate Apache module, things become even more
> interesting.
>   With the small response, everything looks as before.
>   With the big response, only the Apache error page is displayed, with no
> content from the script output.
>
>   With mod_cgi we see the output from the script every time, with or without
> mod_deflate.
>
>   So the ModPerl::* modules do not fully reach their goal of "run unaltered
> CGI scripts under mod_perl".
>
>   As follows from my analysis of mod_perl, mod_cgi and the rest of the httpd
> core, we do not need to check for a new r->status value after the script
> runs at all. Because the mod_perl handler status is not one of (OK,
> DECLINED, DONE) for every r->status != 200, ap_die() is always called
> instead of ap_finalize_request_protocol() in ap_process_request(). All
> responses with non-200 statuses are then processed through ErrorDocument,
> if one exists for that code, and ap_send_error_response() is called for
> them. I think this is the wrong way to do it. 'Altered' scripts have no
> practical use for $r->status($newValue) in the current state of the code.
>
>   So I propose patching the ModPerl::* modules so that they run both
> unaltered and altered scripts more correctly:
>
> -    # handlers shouldn't set $r->status but return it, so we reset the
> -    # status after running it
> -    my $old_status = $self->{REQ}->status;
> -    my $rc = $self->run;
> -    my $new_status = $self->{REQ}->status($old_status);
> -    return ($rc == Apache2::Const::OK && $old_status != $new_status)
> -        ? $new_status
> -        : $rc;
> +    return $self->run;
>
>    If an altered script needs to set a new handler status by doing 'return
> $newHandleStatus', sub run() would have to be changed to pick up the value
> returned from eval {}. But I think everyone who needs this has already
> written their own handler modules ;-)
>
>    Thanks for your attention.
>    I understand that it is hard to make such global changes and that not
> many people are interested in this. So please treat my letter as a request
> for help with historical research ;-)
>
>    Let me repeat my question:
>    Can anyone explain why this code was added and which scenarios it is
> meant for?
>
>    If anybody is interested, I propose that we discuss this.
>
>    Thanks.
>
> --
> Regards,
>  Pavel                          mailto:pavel2000@ngs.ru
>
>
>
>
