+1
A closely related aside... I am not too keen on adding more config directives to control
CGID listen queue depth, number of cgid daemons, etc. Apache is already suffering config
directive overload.
Bill
> I would prefer to make the number of CGI daemons variable personally.
> That way, if I know I have a lot of CGI requests, I can start 5 CGI
> daemons. If I know I have almost no CGI requests, I can start 1.
>
> That should also improve the performance over retrying the connection
> multiple times.
>
> Ryan
>
> ----------------------------------------------
> Ryan Bloom rbb@covalent.net
> 645 Howard St. rbb@apache.org
> San Francisco, CA
>
> > -----Original Message-----
> > From: trawick@rdu163-40-092.nc.rr.com [mailto:trawick@rdu163-40-
> > 092.nc.rr.com] On Behalf Of Jeff Trawick
> > Sent: Monday, January 28, 2002 3:44 AM
> > To: dev@httpd.apache.org
> > Subject: [PATCH] improve mod_cgid robustness
> >
> > On systems that can accept new CGI requests faster than the daemon can
> > fork, it is easy to fill the listen queue on the Unix socket and get
> > ECONNREFUSED, resulting in a 500 error going back to the browser. It
> > is alarming to see a zillion requests for the same resource in the
> > access log, with large numbers of them getting 500 instead of 200.
> >
> > While I suspect that it is appropriate to add a directive to control
> > the listen queue size, that is not sufficient. As long as we are able
> > to fill the queue with the current limit we'll always be able to fill
> > the queue... It'll just take a little (?) longer.
> >
> > This patch retries the connect after a timeout, up to a limit on the
> > number of retries. I suspect that just as we may need a directive for
> > the listen queue size we'll also want a directive for the retry limit.
> >
> > I noticed some socket leakage on error paths. One fix is entangled in
> > the patch below. Probably I will attack the leakage first and then
> > come back and deal with the ECONNREFUSED issue.
> >
> > Index: modules/generators/mod_cgid.c
> > ===================================================================
> > RCS file: /home/cvs/httpd-2.0/modules/generators/mod_cgid.c,v
> > retrieving revision 1.110
> > diff -u -r1.110 mod_cgid.c
> > --- modules/generators/mod_cgid.c 8 Jan 2002 06:26:09 -0000
> 1.110
> > +++ modules/generators/mod_cgid.c 28 Jan 2002 11:36:43 -0000
> > @@ -160,6 +160,16 @@
> > #define DEFAULT_CGID_LISTENBACKLOG 100
> > #endif
> >
> > +/* DEFAULT_CONNECT_ATTEMPTS controls how many times we'll try to
> connect
> > + * to the cgi daemon from the thread/process handling the cgi
> request.
> > + * Generally we want to retry when we get ECONNREFUSED since it is
> > + * probably because the listen queue is full. We need to try harder
> so
> > + * the client doesn't see it as a 500 error.
> > + */
> > +#ifndef DEFAULT_CONNECT_ATTEMPTS
> > +#define DEFAULT_CONNECT_ATTEMPTS 15
> > +#endif
> > +
> > typedef struct {
> > const char *sockname;
> > const char *logname;
> > @@ -863,6 +873,8 @@
> > struct sockaddr_un unix_addr;
> > apr_file_t *tempsock;
> > apr_size_t nbytes;
> > + int connect_tries;
> > + apr_interval_time_t sliding_timer;
> >
> > if(strcmp(r->handler,CGI_MAGIC_TYPE) && strcmp(r->handler,"cgi-
> > script"))
> > return DECLINED;
> > @@ -923,23 +935,46 @@
> > ap_add_cgi_vars(r);
> > env = ap_create_environment(r->pool, r->subprocess_env);
> >
> > - if ((sd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
> > - return log_scripterror(r, conf,
> HTTP_INTERNAL_SERVER_ERROR,
> > errno,
> > - "unable to create socket to cgi
> > daemon");
> > - }
> > memset(&unix_addr, 0, sizeof(unix_addr));
> > unix_addr.sun_family = AF_UNIX;
> > strcpy(unix_addr.sun_path, conf->sockname);
> >
> > - if (connect(sd, (struct sockaddr *)&unix_addr, sizeof(unix_addr))
> <
> > 0) {
> > + connect_tries = 0;
> > + sliding_timer = 100000; /* 100 milliseconds */
> > + while (1) {
> > + ++connect_tries;
> > + if ((sd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
> > return log_scripterror(r, conf,
> HTTP_INTERNAL_SERVER_ERROR,
> > errno,
> > - "unable to connect to cgi
> daemon");
> > - }
> > + "unable to create socket to cgi
> > daemon");
> > + }
> > + if (connect(sd, (struct sockaddr *)&unix_addr,
> sizeof(unix_addr))
> > < 0) {
> > + if (errno == ECONNREFUSED && connect_tries <
> > DEFAULT_CONNECT_ATTEMPTS) {
> > + ap_log_rerror(APLOG_MARK, APLOG_DEBUG, errno, r,
> > + "connect #%d to cgi daemon failed,
> sleeping
> > before retry",
> > + connect_tries);
> > + close(sd);
> > + apr_sleep(sliding_timer);
> > + if (sliding_timer < 2 * APR_USEC_PER_SEC) {
> > + sliding_timer *= 2;
> > + }
> > + }
> > + else {
> > + close(sd);
> > + return log_scripterror(r, conf,
> > HTTP_INTERNAL_SERVER_ERROR, errno,
> > + "unable to connect to cgi
> daemon
> > after multiple tries");
> > + }
> > + }
> > + else {
> > + break; /* we got connected! */
> > + }
> > + }
> >
> > send_req(sd, r, argv0, env, CGI_REQ);
> >
> > /* We are putting the tempsock variable into a file so that we
> can
> > use
> > * a pipe bucket to send the data to the client.
> > + * Note: apr_os_file_put does not establish a cleanup, so we
> better
> > be
> > + * careful.
> > */
> > apr_os_file_put(&tempsock, &sd, 0, r->pool);
> >
> >
> > --
> > Jeff Trawick | trawick@attglobal.net | PGP public key at web site:
> > http://www.geocities.com/SiliconValley/Park/9289/
> > Born in Roswell... married an alien...
>
|