perl-modperl mailing list archives

From Torsten Förtsch
Subject Re: $r->path_info unreliable?
Date Mon, 23 Jun 2014 07:11:00 GMT
On 19/06/14 02:13, Worik Stanton wrote:
> In my handler I call $r->path_info to determine the path used to
> call my script.
> I am trying to have a test version of my code using the same module
> but which reads different data based on the path.
> So I have:
> <Location /MyPackage2>
>     SetHandler perl-script
>     PerlResponseHandler Apache::MyPackage::FrontEnd2
> </Location>
> <Location /MyPackage2Test>
>     SetHandler perl-script
>     PerlResponseHandler Apache::MyPackage::FrontEnd2
> </Location>
> In FrontEnd2....
> sub handler {
>     my $r = shift;
>     warn $r->path_info;
>     if ($r->path_info =~ /test/i) {
>         ## Load test data
>     } else {
>         ## Load real data
>     }
> This works for another package I have exactly like this, but in
> this case $r->path_info is empty.
> I am stumped.

What many people don't understand is what path_info really is. When
you get that right, you'll see why it is very fragile and almost
useless in a modperl context.

Path_info is computed in the maptostorage phase. Consider first the
non-modperl case with a simple configuration that only sets
DocumentRoot, without any Aliases or similar stuff. In that case, the
request uri is simply appended to DocumentRoot and the result is taken
as the name of a file on disk:

DocumentRoot: /web/root
URI:          /path/to/index.html

==> resulting file name: /web/root/path/to/index.html

Now, if that's a regular file on disk, everything is fine. What if not?

You could return a 404 as soon as you figure that out.

But if /web/root/path exists as a regular file and is configured to be
a CGI script, then you probably want to call it. In this case, the
remaining part of the uri, /to/index.html, goes to path_info,
/web/root/path becomes the filename and HTTPD goes on with the request.
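
To make the split concrete, here is a minimal sketch in plain Perl. It is a simplified model of the case described above, not httpd's actual code: the function name split_uri and the %disk hash (a stand-in for the regular files on disk) are assumptions made up for illustration, and the sketch does not try to say what httpd does when no regular file is found along the path.

```perl
use strict;
use warnings;

# Simplified model of the filename/path_info split; NOT httpd's real code.
# $is_file is a hypothetical stand-in for the disk: its keys are the
# paths that exist as regular files.
sub split_uri {
    my ($docroot, $uri, $is_file) = @_;

    my @segments = grep { length } split m{/}, $uri;
    my $filename = $docroot;

    # Walk the uri one segment at a time until a regular file is hit.
    while (@segments) {
        $filename .= '/' . shift @segments;
        if ($is_file->{$filename}) {
            # Whatever is left of the uri becomes path_info.
            my $path_info = @segments ? '/' . join('/', @segments) : '';
            return ($filename, $path_info);
        }
    }
    return;    # no regular file along the path; this sketch just gives up
}

my %disk = ('/web/root/path' => 1);
my ($filename, $path_info) =
    split_uri('/web/root', '/path/to/index.html', \%disk);
print "filename=$filename path_info=$path_info\n";
# prints: filename=/web/root/path path_info=/to/index.html
```

Note that the result depends on %disk just as much as on the uri: change what exists on disk and the same request splits differently.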

I am not quite sure what happens if /web/root/path is a directory
instead of a regular file. But I think the resulting
filename/path_info pair would be the same.

So, the splitting of uri into filename and path_info depends not only
on the web server configuration but also on the layout of files and
directories on disk. That's why I said it's fragile. You create a new
directory because you want to put some static information there and
all of a sudden the application stops working.

In a modperl context, I think it'd make much more sense to split the
uri like this:

  filename = docroot . substr(uri, 0, length(location))
  path_info = substr(uri, length(location))

But that's not easy to achieve in the general case. You must take into
account all sorts of aliases, <Directory>, <DirectoryMatch>,
<Location> and <LocationMatch> directives.
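
For the simplest case only (one plain <Location>, no aliases, no regex sections), the split proposed above can be sketched in plain Perl as follows. The function name split_by_location is made up for illustration. In a real handler the inputs would presumably come from $r->uri and $r->location (the latter from Apache2::RequestUtil), but that wiring is an assumption and is not shown here.

```perl
use strict;
use warnings;

# Sketch of the location-based split for the simple case only:
# a single plain <Location>, no aliases, no <LocationMatch>.
sub split_by_location {
    my ($docroot, $uri, $location) = @_;

    # The uri must start with the location path for the split to make sense.
    return unless index($uri, $location) == 0;

    my $filename  = $docroot . substr($uri, 0, length $location);
    my $path_info = substr($uri, length $location);
    return ($filename, $path_info);
}

my ($filename, $path_info) =
    split_by_location('/web/root', '/MyPackage2Test/some/thing', '/MyPackage2Test');
print "filename=$filename path_info=$path_info\n";
# prints: filename=/web/root/MyPackage2Test path_info=/some/thing
```

Unlike the disk-based split, this result depends only on the uri and the configured location, not on which files or directories happen to exist.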

And you must consider that although an apache module can take over the
maptostorage phase and consequently skip all that <Directory> stuff,
it cannot avoid the processing of <Location> and <LocationMatch>.
These are not encapsulated as a separate request phase but simply
happen between the maptostorage and the headerparser phase. Also,
there is the problem with subrequests and internal redirects.
Depending on their type they enter the request processing cycle either
in the uri translation or the maptostorage phase. And they can (at
least in earlier httpd versions) skip the header parser phase. So, the
only place where you could possibly patch up the request accordingly
is the fixup or type checking phase just before response. But at this
point so much has already happened to the request (header parser,
access control, type checking) that, to avoid introducing security bugs,
you'd have to go back to the header parser phase at least. So, there is
no way for modperl to get it right in a way that is sensible from the
Perl perspective.

