httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "pqf" <...@mailtech.cn>
Subject Re: Re: mod_fcgid can kill all the services on the server via kill -15 -1
Date Wed, 04 May 2011 04:51:44 GMT
Hi, guys
Here is the new patch, anyone review it? I will commmit it if no one respond :)

Index: fcgid_proc_unix.c
===================================================================
--- fcgid_proc_unix.c   (revision 1099314)
+++ fcgid_proc_unix.c   (working copy)
@@ -402,6 +402,7 @@
     procnode->proc_id = tmpproc;
 
     if (rv != APR_SUCCESS) {
+        memset(&procnode->proc_id, 0, sizeof(procnode->proc_id));
         ap_log_error(APLOG_MARK, APLOG_ERR, rv, procinfo->main_server,
                      "mod_fcgid: can't run %s", wargv[0]);
     }
@@ -414,6 +415,11 @@
     /* su as root before sending signal, for suEXEC */
     apr_status_t rv;
 
+    if (procnode->proc_id.pid == 0) {
+        /* procnode->proc_id.pid be 0 while fcgid_create_privileged_process() fail */
+        return APR_SUCCESS; 
+    }
+
     if (ap_unixd_config.suexec_enabled && seteuid(0) != 0) {
 
         /* can't gain privileges to send signal (should not occur); do NOT
@@ -461,6 +467,7 @@
         /* Destroy pool */
         apr_pool_destroy(procnode->proc_pool);
         procnode->proc_pool = NULL;
+        memset(&procnode->proc_id, 0, sizeof(procnode->proc_id));
 
         return APR_CHILD_DONE;
     }



2011-05-04



Ryan 


发件人: "pqf"<pqf@mailtech.cn>
发送时间: 2011-04-18 09:58
主 题: Re: mod_fcgid can kill all the services on the server via kill -15 -1
收件人: "dev"<dev@httpd.apache.org>



Hi, all
Another question, does proc_wait_process() should update procnode->proc_id to 0 too? or
else mod_fcgid may send a signal to another irrelevant process while apache is shutting down?
I don't follow up mod_fcgid for a while, I just took a glance, maybe it's updated somewhere
else?
By the way, procnode->proc_id is set to 0 while apache startup, so why not update procnode->proc_id
to 0 while fcgid_create_privileged_process() is fail? And then check magic number 0 rather
than both -1 and 0,  in both proc_kill_gracefully() and proc_kill_force().

Cheers.
Ryan


 
Hello,


There is a very interesting, and quite a rare bug in mod_fcgid. It is easy to reproduce if
you can cause fork to fail (which can be done with CloudLinux -- if anyone wants to replicate
it).


Here is how it works: 
mod_fcgid tries to spawn a new process (proc_spawn_process in fcgid_proc_unix.c), but fork
returns -1. 
More exactly fcgid_create_privileged_process function call returns error, and fills in tmpproc.pid
with -1 & tmpproc is assiged to procnode->proc_id).


Now, if at the same time service httpd restart is executed, function kill_all_subprocess in
fcgid_pm_main.c will execute, and it will try to go through all procnodes, sending SIGTERM
via proc_kill_gracefully, (and then SIGKILL via proc_kill_force) to procnode->proc_id.pid
Yet, one procnode will be pointing to procnode->proc_id.pid, causing kill -15 -1 (kill
all).
The end results all services on the server failing, including SSH, apache, syslogd, etc..


I guess the problem is really rare for most people. Also it is quite hard to diagnose, as
it is completely not clear where the signal came from, and it took us some time to correlate
them with apache restarts.. Yet due to our OS being used by shared hosts (where httpd restart
is common thing), and our ability to limit memory available to processes on per virtual host
bases (which causes fork to fail once that virtual host reaches memory limit), we see the
issue quite often.


The solution is quite simple (not sure if it is the best / right solution), in file: fcgid_proc_unix.c,
in methods proc_kill_gracefully, line:


    rv = apr_proc_kill(&(procnode->proc_id), SIGTERM);


should be changed to:
   if (procnode->proc_id.pid != -1) {
          rv = apr_proc_kill(&(procnode->proc_id), SIGTERM);
   } else {
          rv = APR_SUCCESS;
   }


Similarly in proc_kill_force
    rv = apr_proc_kill(&(procnode->proc_id), SIGKILL);
should be changed to:
   if (procnode->proc_id.pid != -1) {
          rv = apr_proc_kill(&(procnode->proc_id), SIGKILL);
   } else {
          rv = APR_SUCCESS;
   }


Regards,
Igor Seletskiy
CEO @ Cloud Linux Inc
Mime
View raw message