httpd-users mailing list archives

From Ben Johnson <...@indietorrent.org>
Subject Re: [users@httpd] moving from mod_php to mod_fcgid : rewrite problem
Date Thu, 14 Feb 2013 03:15:40 GMT


On 2/13/2013 4:14 PM, Riccardo Cohen wrote:
> Hi Ben
> 
>>> I tried without the dot : RewriteRule ^en/(.*) index.php/en/$1 but it
>>> gave also an error 404.
>>
>> It would be helpful to know what, exactly, appears in Apache's access
>> log (and/or error log, if you can manage to find that, too) in each of
>> these test cases.
> 
> I've asked for the apache error log, and found no error in it.
> Only one which was a request done before adding the new .htaccess, but
> nothing else :
> 
> [Tue Feb 12 21:04:17 2013] [error] [client 90.24.101.9] File does not
> exist: /datas/vol1/w4a125552/var/www/perspectives-musicales.org/test6

Very good. No problems there.

> 
> The access log shows all requests normally with no particular message :
> 
> 90.24.101.9 - - [12/Feb/2013:21:04:46 +0100] "GET /test1/a/b/c HTTP/1.1"
> 404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
> Firefox/18.0" "20130212210446"
> 
> 90.24.101.9 - - [12/Feb/2013:21:04:51 +0100] "GET /test2/a/b/c HTTP/1.1"
> 200 52 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
> Firefox/18.0" "20130212210451"
> 
> 90.24.101.9 - - [12/Feb/2013:21:04:56 +0100] "GET /test4/a/b/c HTTP/1.1"
> 302 206 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
> Firefox/18.0" "20130212210456"
> 
> 90.24.101.9 - - [12/Feb/2013:21:03:28 +0100] "GET /test5/a/b/c HTTP/1.1"
> 404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
> Firefox/18.0" "20130212210328"
> 
> 90.24.101.9 - - [12/Feb/2013:21:04:17 +0100] "GET /test6/a/b/c HTTP/1.1"
> 404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
> Firefox/18.0" "20130212210417"

This seems to imply that Apache itself is not generating the 404 errors;
if it were, one would expect a "File does not exist" entry in the error
log for each of them, like the one quoted above.

> 
>>
>>> These are all my tests : (available at
>>> http://www.perspectives-musicales.org/test1/a/b/c etc.)
>>>
>>> RewriteRule ^test1/(.*) ./test.php/$1
>>> # = error 404
>>
>> I hit this URL and from what I can tell, the 404 response header is
>> coming from PHP, not Apache. The output is "No input file specified."
>> This doesn't look like a "stock" Apache 404 response. Did you build
>> logic into test.php that emits a 404 response header and this message
>> when some parameter is absent from the URL?
> 
> test.php is only this :
> 
> ok test
> <br>
> <?
> $info=$_SERVER["PATH_INFO"];
> echo "INFO=".$info."<br>";
> $query=$_SERVER["QUERY_STRING"];
> echo "query=".$query."<br>";
> ?>
> 
> maybe the error comes from mod_fcgid itself ?

Quite possibly. In fact, a search for "mod_fcgid No input file
specified" yields the following article:

http://isp-control.net/forum/printthread.php?tid=12653

Of particular import is the suggestion, "Okay, this may be caused by
either (1) apache sending an incorrect path to the php file to php5-cgi;
or (2) something (permissions?) that prevents php5-cgi from running the
script."

Do other PHP scripts function as expected when executed via mod_fcgid?
Or do they all return the error string, "No input file specified" and a
404 response?
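
One way to narrow down option (1) might be to request a small script
directly (with no rewrite involved, e.g. /pathdump.php/a/b/c) and look at
the paths that PHP actually receives. This is only a sketch: the file
name pathdump.php and the helper fcgi_path_report() are hypothetical, and
which of these variables are populated depends on how your host has set
up mod_fcgid.

```php
<?php
// Hypothetical diagnostic helper: collect the CGI variables that decide
// which file PHP tries to open. Keys that are absent are reported as
// '(not set)' rather than being skipped.
function fcgi_path_report(array $server)
{
    $keys = array(
        'SCRIPT_FILENAME', 'SCRIPT_NAME', 'PATH_INFO', 'PATH_TRANSLATED',
        'REQUEST_URI', 'REDIRECT_URL', 'DOCUMENT_ROOT',
    );
    $report = array();
    foreach ($keys as $key) {
        $report[$key] = isset($server[$key]) ? $server[$key] : '(not set)';
    }
    return $report;
}

// Print the report as plain text, one "NAME = value" pair per line.
if (!headers_sent()) {
    header('Content-Type: text/plain');
}
foreach (fcgi_path_report($_SERVER) as $key => $value) {
    echo $key, ' = ', $value, "\n";
}
```

If SCRIPT_FILENAME does not point at a real file for a given request,
that would be consistent with the "No input file specified" message.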

>>
>>> RewriteRule ^test2/(.*) ./test.php?$1
>>> # = parameters are in query_string instead of path_info
>>
>> Why is this a problem?
> 
> My whole web application is developed with urls like
> 
> http://www.perspectives-musicales.org/en/all-associations
> 
> for search engine optimizations, where "en" and "all-associations" are
> not pages or directories, but program arguments (replacing
> "?lang=en&command=all-associations" which are poor seo)

Right; I built a PHP framework that uses so-called "clean-URLs", and am
well-versed in the theory behind this approach, as well as its
execution. The rationale seems sound.

> So, as explained in my first email, all arguments to my application
> controller are in $_SERVER["PATH_INFO"] (and not
> $_SERVER["QUERY_STRING"]). And that did work like a charm with
> mod_php... Changing all my application with data in query_string is not
> very complicated if I wrote a good program ( :) ) but will need a lot of
> checks.
> 
> Actually at the point where I am now, i've already spent some time on it...

My PHP framework functions the same way via mod_php as it does with
mod_fcgid and mod_fastcgi. I achieved this by using a well-known
technique to rewrite the URLs (I place these directives into the
site-root's .htaccess file):

<IfModule mod_rewrite.c>
  RewriteEngine on
  Options All

  # Modify the RewriteBase if you are using a subdirectory and the
  # rewrite rules are not working properly:
  # WARNING: Do not include a trailing slash on this directive if you
  # include a path other than /!
  #RewriteBase /

  # Rewrite clean URLs to the form 'index.php?q=x' (except for requests
  # that match real files/directories):
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d

  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>

(WordPress, Joomla, and many other frameworks do something similar.)

Then, in PHP, $_GET['q'] will always contain the "clean URL" (unless, of
course, the 'q' value is overwritten, e.g., the URL contains
"?q=something-else"). For this reason, you may wish to use something
other than "q" in the RewriteRule value. You can then parse the
clean URL to obtain its individual segments and do with them as you
will. An oversimplified example is to call explode('/',
trim($_GET['q'], '/')) in PHP, which returns an array that contains
the various path segments. For the URL
http://www.perspectives-musicales.org/en/all-associations, dumping that
array would show

array (size=2)
  0 => string 'en' (length=2)
  1 => string 'all-associations' (length=16)
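
A slightly more defensive version of that parsing could look like the
sketch below. It assumes the rewrite rule passes the clean URL in the
'q' parameter; the function name clean_url_segments() is made up for
illustration and is not part of any framework.

```php
<?php
// Hypothetical helper: split the clean URL that the RewriteRule passes
// in the 'q' parameter into its path segments.
function clean_url_segments($q)
{
    // Trim leading/trailing slashes so that '/en/all-associations/'
    // and 'en/all-associations' produce the same result.
    $trimmed = trim((string) $q, '/');

    // explode() on an empty string returns array(''), so handle the
    // site root explicitly.
    if ($trimmed === '') {
        return array();
    }

    // Drop empty segments produced by doubled slashes ('en//x') and
    // reindex the result from 0.
    return array_values(array_filter(explode('/', $trimmed), 'strlen'));
}

// Typical use at the top of index.php:
$segments = clean_url_segments(isset($_GET['q']) ? $_GET['q'] : '');
```

With the URL above, $segments[0] would be the language and $segments[1]
the command.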

Granted, undertaking this approach would mean rewriting certain aspects
of your application, but chances are that you'll thank yourself later.
You'll have a much more portable application that is
scripting-language-agnostic, with respect to URL structure. (Switching
to another scripting language requires a simple change to your
RewriteRule only.)
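
If rewriting the application is too much work right now, a small shim at
the very top of index.php might bridge the gap by rebuilding a
PATH_INFO-style value from the query parameter. This is an untested
sketch under the assumption that the clean URL arrives as 'q'; the
function name path_info_from_query() is hypothetical.

```php
<?php
// Hypothetical compatibility shim: rebuild a PATH_INFO-style value from
// the 'q' parameter so that code reading $_SERVER['PATH_INFO'] keeps
// working. Assumes the rewrite rule passes the clean URL as 'q'.
function path_info_from_query(array $server, array $get)
{
    // Prefer a real PATH_INFO when the server supplies one.
    if (isset($server['PATH_INFO']) && $server['PATH_INFO'] !== '') {
        return $server['PATH_INFO'];
    }
    // Nothing usable in 'q': report an empty path.
    if (!isset($get['q']) || !is_string($get['q']) || $get['q'] === '') {
        return '';
    }
    // PATH_INFO conventionally starts with a slash.
    return '/' . ltrim($get['q'], '/');
}

$_SERVER['PATH_INFO'] = path_info_from_query($_SERVER, $_GET);
```

Note that $_SERVER['QUERY_STRING'] would still contain the "q=..." pair,
so any code that inspects the raw query string would need a look as well.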

>>
>> It should be stated that mod_php and mod_fcgid populate these values in
>> different ways. From what I understand, PATH_INFO is less reliable and
>> less well-implemented than QUERY_STRING. Fundamentally, this is why you
>> are observing different behavior/values here after moving from mod_php
>> to mod_fcgid.
> 
> I'm not sure that this is a problem with the PATH_INFO variable since
> the error occurs even before php has any chance to start executing (the
> test.php is not executed at all in test1)

This may be for the reasons outlined in the article that I cited above.
If you'd like to post your CGI wrapper script, I'd be happy to take a
look. Alas, you may lack access to this script, in which case, it's a
moot point. Although, I must say, it seems unlikely that your host would
have misconfigured the wrapper script. (Then again, we've all seen worse.)

>>> RewriteRule ^test3/(.*) ./test.php?/$1
>>> # = parameters are in query_string instead of path_info
>>
>> Same as above.
>>
>>> RewriteRule ^test4/(.*)
>>> http://www.perspectives-musicales.org/test.php/$1
>>> # = redirection 302
>>
>> I don't see a 302 response for this one. I see the same 404 and message
>> as above. Maybe you changed something after sending this message.
> 
> I use firefox http live header and it shows a status code 302 ("HTTP/1.1
> 302 Found") then the browser redirects to the page as if it was another
> website

You're right; I checked this again, and I do see the 302 redirect. I
think it was a matter of enabling the "Persist" feature in Firebug.
(Otherwise, the "Net" panel is refreshed after the redirect is sent.)
Thanks for double-checking your work here!

> I still think that [apache or mod_fcgid] cannot execute test.php in
> test1 just because it thinks it is a directory and cannot find it.

That may very well be. And the solution I offered above should address
that shortcoming.

I can't tell you exactly why it doesn't work (only a VPS with shell
access would make that possible), but I can tell you what *does* work.

I'm happy to answer any questions.

Good luck!

-Ben

> 
>>
>>> RewriteRule ^test5/(.*) test.php/$1
>>> # = error 404
>>>
>>> RewriteRule ^test6/(.*) /test.php/$1
>>> # = error 404
>>
>> Same as the others with 404 responses.
>>
>>>
>>>
>>> Thanks for your help.
>>
>> You're welcome. I'll wait to hear back before offering additional
>> information.
>>
>> -Ben
>>
>>>
>>>
>>> On 12/02/13 19:40, Ben Johnson wrote:
>>>>
>>>>
>>>> On 2/12/2013 10:59 AM, Riccardo Cohen wrote:
>>>>> Thanks Ben, here are the answers :
>>>>>
>>>>>> 1.) Where have you defined the rewrite rule? In a .htaccess file?
>>>>>
>>>>> in .htaccess
>>>>>
>>>>>> 2.) Have you defined a RewriteBase? If so, what is it?
>>>>>
>>>>> no change with or without
>>>>>
>>>>>> 3.) Have you reviewed Apache's access log at all?
>>>>>
>>>>> I'll have a look now
>>>>>
>>>>>> 4.) Have you increased RewriteLogLevel to, say, 4, to see exactly
>>>>>> what
>>>>>> the mod_rewrite engine is doing?
>>>>>
>>>>> I'll try that. Is it possible to set it in .htaccess or must I
>>>>> change the global apache configuration (I only have access to my
>>>>> .htaccess in this hosting).
>>>>
>>>> Unfortunately, RewriteLogLevel can be set in the "server config" and
>>>> "virtual host" contexts only. (You can make this type of determination
>>>> in the future by visiting the manual page and looking for the "context"
>>>> value:
>>>> http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewriteloglevel .)
>>>>
>>>>
>>>> This is one of many reasons for which hosting on a VPS over which you
>>>> have complete control is beneficial.
>>>>
>>>> In any case, we'll have to proceed without access to the rewrite log.
>>>>
>>>> Is there a specific reason for which you're using "./index.php" in the
>>>> right-hand side of the rule? I'm referring to the period ("."), in
>>>> particular. This may well be the source of the problem. It could be
>>>> that
>>>> mod_php interprets that relative path (./index.php) "correctly",
>>>> whereas
>>>> mod_fcgid does not.
>>>>
>>>> Try this:
>>>>
>>>> RewriteRule ^en/(.*) index.php/en/$1
>>>>
>>>> -Ben
>>>>
>>>>
>>>>
>>>>> Thanks
>>>>>
>>>>> On 12/02/13 14:53, Ben Johnson wrote:
>>>>>>
>>>>>>
>>>>>> On 2/12/2013 2:16 AM, Riccardo Cohen wrote:
>>>>>>> Hello
>>>>>>> I received some clues from this list's members, thanks for that.
>>>>>>> But unfortunately my problem is not solved.
>>>>>>>
>>>>>>> It's not that I want others to focus on me, but I'm quite sure
>>>>>>> that there is a real problem (if not, why would it work perfectly
>>>>>>> on mod_php?). I could not find any solution googling about it
>>>>>>> (even with the help of the host's technical team), and I would
>>>>>>> like a confirmation that 1) it's not an error in my
>>>>>>> understanding, and 2) there is no workaround for it.
>>>>>>
>>>>>> I doubt it is a problem with the software. mod_rewrite has been put
>>>>>> through the paces over the years and I'd be shocked if a bug were
>>>>>> uncovered given your rule's relative simplicity.
>>>>>>
>>>>>> Before digesting your post in its entirety, I have a couple of
>>>>>> questions
>>>>>> first.
>>>>>>
>>>>>> 1.) Where have you defined the rewrite rule? In a .htaccess file?
>>>>>>
>>>>>> 2.) Have you defined a RewriteBase? If so, what is it?
>>>>>>
>>>>>> 3.) Have you reviewed Apache's access log at all?
>>>>>>
>>>>>> 4.) Have you increased RewriteLogLevel to, say, 4, to see exactly
>>>>>> what
>>>>>> the mod_rewrite engine is doing?
>>>>>>
>>>>>> -Ben
>>>>>>
>>>>>>
>>>>>>
>>>>>>> So I'd be very pleased to hear from some qualified developer
>>>>>>> before I spend 2 days modifying and retesting my whole
>>>>>>> application.
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> On 07/02/13 11:17, Riccardo Cohen wrote:
>>>>>>>> Sorry to insist but I'm really blocked and I really need help.
>>>>>>>> Here is a small summary for those who don't want to read all :
>>>>>>>>
>>>>>>>> I want to make a rewrite from :
>>>>>>>>
>>>>>>>> http://www.perspectives-musicales.org/en/all-albums
>>>>>>>> to
>>>>>>>> http://www.perspectives-musicales.org/index.php/en/all-albums
>>>>>>>>
>>>>>>>> my rewrite rule is
>>>>>>>>
>>>>>>>> RewriteRule ^en/(.*) ./index.php/en/$1
>>>>>>>>
>>>>>>>> This works when apache is running with mod_php, but not when
>>>>>>>> running mod_fcgid (php as cgi). In cgi mode I have a 404 error.
>>>>>>>>
>>>>>>>> The Apache version is 2.2.23 and mod_fcgid is version 2.3.7
>>>>>>>> with configuration flag cgi.fix_pathinfo=1
>>>>>>>>
>>>>>>>> Thanks for your help.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 05/02/13 21:32, Riccardo Cohen wrote:
>>>>>>>>> Hello
>>>>>>>>> I'm new to the apache mailing list, sorry if I'm not 100%
>>>>>>>>> clear, and sorry for this long description.
>>>>>>>>>
>>>>>>>>> I have developed a website with php/mysql :
>>>>>>>>> http://www.perspectives-musicales.org and placed it on a good
>>>>>>>>> hosting service (web4all.fr).
>>>>>>>>> To improve search engine rank I decided to set all urls to
>>>>>>>>> /index.php/... and rewrite them to avoid having index.php in
>>>>>>>>> the url (sort of MVC technique combined with SEO...)
>>>>>>>>>
>>>>>>>>> Example : the catalog is at url :
>>>>>>>>> http://www.perspectives-musicales.org/en/all-albums
>>>>>>>>> This should be transparently mapped to
>>>>>>>>> http://www.perspectives-musicales.org/index.php/en/all-albums
>>>>>>>>> thanks to the rewrite rule :
>>>>>>>>>
>>>>>>>>> RewriteRule ^en/(.*) ./index.php/en/$1
>>>>>>>>>
>>>>>>>>> My application then uses $_SERVER["PATH_INFO"] (and not
>>>>>>>>> $_SERVER["QUERY_STRING"]) to retrieve url information. This
>>>>>>>>> worked perfectly until last month, when web4all.fr changed the
>>>>>>>>> whole system and separated apache from php, using fast cgi
>>>>>>>>> instead of mod_php.
>>>>>>>>>
>>>>>>>>> The system is supposed to be more reliable and more efficient
>>>>>>>>> like this, and apparently is. But the rewrite rule does not
>>>>>>>>> work anymore. So I investigated and made some tests :
>>>>>>>>>
>>>>>>>>> I have a small test.php that displays the path_info and
>>>>>>>>> query_string.
>>>>>>>>> You can presently test it here :
>>>>>>>>>
>>>>>>>>> http://perspectives-musicales.org/test1/a/b/c
>>>>>>>>> http://perspectives-musicales.org/test2/a/b/c
>>>>>>>>> http://perspectives-musicales.org/test3/a/b/c
>>>>>>>>> http://perspectives-musicales.org/test4/a/b/c
>>>>>>>>>
>>>>>>>>> and I set the following rules :
>>>>>>>>>
>>>>>>>>> RewriteRule ^test1/(.*) ./test.php/$1
>>>>>>>>> RewriteRule ^test2/(.*) ./test.php?$1
>>>>>>>>> RewriteRule ^test3/(.*) ./test.php?/$1
>>>>>>>>> RewriteRule ^test4/(.*)
>>>>>>>>> http://www.perspectives-musicales.org/test.php/$1
>>>>>>>>>
>>>>>>>>> None of these 4 rewrite rules is suitable. Here is why :
>>>>>>>>>
>>>>>>>>> - test1 : the system answers 404 "No input file specified". I
>>>>>>>>> think (not sure) that Apache believes that test.php is a
>>>>>>>>> folder, and cannot find it so answers 404
>>>>>>>>>
>>>>>>>>> - test2 : the rewrite rule works, but of course the url
>>>>>>>>> information is no longer in path_info, it is in query_string
>>>>>>>>> as shown in the page content
>>>>>>>>>
>>>>>>>>> - test3 : same as test2
>>>>>>>>>
>>>>>>>>> - test4 : almost good, I can have the url info in path_info,
>>>>>>>>> but apache begins first with a 302 redirection and then changes
>>>>>>>>> the url to http://www.perspectives-musicales.org/test.php/a/b/c,
>>>>>>>>> which loses all search engine efficiency (and also any POST
>>>>>>>>> variables).
>>>>>>>>>
>>>>>>>>> My host tried several searches on forums including this one,
>>>>>>>>> and could not find any answer. It seems to be an apache bug,
>>>>>>>>> but not sure, I have no bug number to give anyway. If it is a
>>>>>>>>> bug, it is demonstrated by test1 I think.
>>>>>>>>>
>>>>>>>>> So here is my question : Is there any way to make this rewrite
>>>>>>>>> rule work in fastcgi mode, and what is the syntax for it, to
>>>>>>>>> keep info in path_info without 302 redirection. The Apache
>>>>>>>>> version is 2.2.23 and mod_fcgid is version 2.3.7 with
>>>>>>>>> configuration flag cgi.fix_pathinfo=1
>>>>>>>>>
>>>>>>>>> If there is a way, thanks for your help, I'd be glad to test
>>>>>>>>> it. If not, could you explain why and how to solve it. As a
>>>>>>>>> workaround we used the test4 syntax in the whole site, to make
>>>>>>>>> it work, but it is bad for search engines, and creates problems
>>>>>>>>> in the backoffice (because certain backoffice functions use
>>>>>>>>> POST variables)
>>>>>>>>>
>>>>>>>>> I know I can change my code to use query_string everywhere
>>>>>>>>> instead of path_info, but if I can avoid changing and testing
>>>>>>>>> all my websites it would be really great
>>>>>>>>>
>>>>>>>>> Thanks a lot for your answer.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>>>>>> For additional commands, e-mail: users-help@httpd.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
> 


