httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dean Gaudet <dgau...@arctic.org>
Subject [PATCH] accept serialization
Date Fri, 08 Aug 1997 10:52:59 GMT
Here's a big one.  I've tested this on linux 2.0.30, solaris 2.5.1, and
irix 5.3.  It implements Sys V semaphores, posix mutexes (apparently only
solaris works with this), and uslocks (irix).  The sysvsem implementation
goes to great effort to clean up after itself.

All of those have run successfully under a 20 second attack of "while 1;
kill -USR1" and "while 1; kill -HUP".  I didn't see any resource leakages
... I may not have looked hard enough.  I tested sysv semaphores on Linux,
Solaris and IRIX.  I tested pthreads on Solaris, and uslocks on IRIX.
The parent should only leak a semaphore if it's kill -9d or seg faults
(we can probably clean up the seg fault case too).

This patch also gives us the option of unserializing the single listen
case, but leaving the multiple Listen case serialized, by defining
SAFE_UNSERIALIZED_ACCEPT.

Martijn told me that he's been running his IRIX servers without any
serialization and it works better for him .  I suspect that's because
fcntl/flock are extremely lame under IRIX.  But I haven't decided to make
SAFE_UNSERIALIZED_ACCEPT the default for IRIX in this patch ... I'd love
to have a test program that can be used to figure out if a system should
go serial or not.

Important note about SysVsem:  It is possible for a CGI running as
the same uid as httpd to find the semaphore, and acquire it without
SEM_UNDO, and then exit... and the stupid semaphore will be locked
forever, effectively shutting down the server.  I can see no workaround
except suexec, this is yet another deficiency in the sysvsem interface.
(Bad things also happen if someone stupidly removes the semaphore while
the server is still running, but they deserve what they get.)

Having said that, I still think it's worthwhile to include.  If you
remember the numbers for IRIX it is a big performance win to default to
using it on IRIX.  Multi-CPU IRIX boxes should definately use USLOCK ...
single cpu boxes don't have any faster alternative.  My patch leaves it
up to the admin to build it with USLOCK instead.

My proposal is to include this patch, and then see what happens during
the betas.  It's easy to switch back to the sturdy 1.2 defaults later
on down the road.

Dean

P.S. One more future improvement for this will be to error_log a warning
when people have no serialization and multiple Listen statements.

P.P.S. DEFAULT_TYPE is changed to DEFAULT_CONTENT_TYPE because pthread.h
defines DEFAULT_TYPE.

Index: CHANGES
===================================================================
RCS file: /export/home/cvs/apache/src/CHANGES,v
retrieving revision 1.390
diff -u -r1.390 CHANGES
--- CHANGES	1997/08/08 08:09:06	1.390
+++ CHANGES	1997/08/08 10:36:33
@@ -1,7 +1,31 @@
 Changes with Apache 1.3a2
 
-  *) PORT: Timings show that on Linux 2.0.30 flock is slightly cheaper than
-     fcntl file locking.  So use flock for the mutexes. [Dean Gaudet]
+  *) PORT: Apache has need for mutexes to serialize its children around
+     accept.  In prior versions either fcntl file locking or flock file
+     locking were used.  The method is chosen by the definition of
+     USE_xxx_SERIALIZED_ACCEPT in conf.h.  xxx is FCNTL for fcntl(),
+     and FLOCK for flock().  New options have been added:
+	- SYSVSEM to use System V style semaphores
+	- PTHREAD to use Posix threads (appears to work on Solaris only)
+	- USLOCK to use IRIX uslock
+     Based on timing various techniques, the following changes were made
+     to the defaults:
+	- Linux 2.x uses flock instead of fcntl
+	- Solaris 2.x uses pthreads
+	- IRIX uses SysV semaphores -- however multiprocessor IRIX boxes
+	    work far faster if you -DUSE_USLOCK_SERIALIZED_ACCEPT
+     [Dean Gaudet, Pierre-Yves Kerembellec <Pierre-Yves.Kerembellec@vtcom.fr>,
+     Martijn Koster <m.koster@pobox.com>]
+
+  *) PORT: The semantics of accept/select make it very desirable to use
+     mutexes to serialize accept when multiple Listens are in use.  But
+     in the case where only a single socket is open it is sometimes
+     redundant to serialize accept().  Not all unixes do a good job with
+     potentially dozens of children blocked on accept() on the same
+     socket.  It's now possible to define SAFE_UNSERIALIZED_ACCEPT and
+     the server will avoid serialization when listening on only one socket,
+     and use serialization when listening on multiple sockets.
+     [Dean Gaudet] PR#467
 
   *) Fix another long-standing bug in sub_req_lookup_file where it would
      happily skip past access checks on subdirectories looked up with
Index: Configure
===================================================================
RCS file: /export/home/cvs/apache/src/Configure,v
retrieving revision 1.125
diff -u -r1.125 Configure
--- Configure	1997/08/04 23:45:59	1.125
+++ Configure	1997/08/08 10:36:35
@@ -391,7 +391,7 @@
     	SOLVER=`echo $PLAT | sed -e 's/^.*solaris2.//'`
 	OS="Solaris $SOLVER"
 	CFLAGS="$CFLAGS -DSOLARIS2=$SOLVER"
-	LIBS="$LIBS -lsocket -lnsl"
+	LIBS="$LIBS -lsocket -lnsl -lpthread"
 	DBM_LIB=""
 	case "$SOLVER" in
 	    2[0123]*)
Index: conf.h
===================================================================
RCS file: /export/home/cvs/apache/src/conf.h,v
retrieving revision 1.122
diff -u -r1.122 conf.h
--- conf.h	1997/08/07 09:39:48	1.122
+++ conf.h	1997/08/08 10:36:36
@@ -100,7 +100,10 @@
 #define HAVE_SYS_RESOURCE_H
 #define bzero(a,b) memset(a,0,b)
 #define JMP_BUF sigjmp_buf
-#define USE_FCNTL_SERIALIZED_ACCEPT
+/*#define USE_FCNTL_SERIALIZED_ACCEPT*/
+/*#define USE_SYSVSEM_SERIALIZED_ACCEPT*/
+#define USE_PTHREAD_SERIALIZED_ACCEPT
+#define NEED_UNION_SEMUN
 #define HAVE_MMAP
 #define HAVE_CRYPT_H
 int gethostname(char *name, int namelen);
@@ -112,7 +115,8 @@
 #define NO_KILLPG
 #undef NO_SETSID
 #define JMP_BUF sigjmp_buf
-#define USE_FCNTL_SERIALIZED_ACCEPT
+/*#define USE_FCNTL_SERIALIZED_ACCEPT*/
+#define USE_SYSVSEM_SERIALIZED_ACCEPT
 #define HAVE_SHMGET
 #define HAVE_CRYPT_H
 #define NO_LONG_DOUBLE
Index: http_core.c
===================================================================
RCS file: /export/home/cvs/apache/src/http_core.c,v
retrieving revision 1.109
diff -u -r1.109 http_core.c
--- http_core.c	1997/08/06 20:21:24	1.109
+++ http_core.c	1997/08/08 10:36:44
@@ -352,7 +352,7 @@
     core_dir_config *conf = 
       (core_dir_config *)get_module_config(r->per_dir_config, &core_module); 
 
-    return conf->default_type ? conf->default_type : DEFAULT_TYPE;
+    return conf->default_type ? conf->default_type : DEFAULT_CONTENT_TYPE;
 }
 
 API_EXPORT(char *) document_root (request_rec *r) /* Don't use this!!! */
Index: http_main.c
===================================================================
RCS file: /export/home/cvs/apache/src/http_main.c,v
retrieving revision 1.198
diff -u -r1.198 http_main.c
--- http_main.c	1997/08/08 08:00:19	1.198
+++ http_main.c	1997/08/08 10:36:52
@@ -261,18 +261,248 @@
 }
 #endif
 
-#if defined(USE_FCNTL_SERIALIZED_ACCEPT)
+#if defined (USE_USLOCK_SERIALIZED_ACCEPT)
+
+#include <ulocks.h>
+
+static ulock_t uslock = NULL;
+
+#define accept_mutex_cleanup()
+
+static void accept_mutex_init(pool *p)
+{
+    ptrdiff_t old;
+    usptr_t *us;
+
+
+    /* default is 8, allocate enough for all the children plus the parent */
+    if ((old = usconfig(CONF_INITUSERS, HARD_SERVER_LIMIT+1)) == -1) {
+        perror("usconfig(CONF_INITUSERS)");
+        exit(-1);
+    }
+
+    if ((old = usconfig(CONF_LOCKTYPE, US_NODEBUG)) == -1) {
+        perror("usconfig(CONF_LOCKTYPE)");
+        exit(-1);
+    }
+    if ((old = usconfig(CONF_ARENATYPE, US_SHAREDONLY)) == -1) {
+        perror("usconfig(CONF_ARENATYPE)");
+        exit(-1);
+    }
+    if ((us = usinit("/dev/zero")) == NULL) {
+        perror("usinit");
+        exit(-1);
+    }
+
+    if ((uslock = usnewlock(us)) == NULL) {
+        perror("usnewlock");
+        exit(-1);
+    }
+}
+
+static void accept_mutex_on()
+{
+    switch(ussetlock(uslock)) {
+        case 1:
+            /* got lock */
+            break;
+        case 0:
+            fprintf(stderr, "didn't get lock\n");
+            exit(-1);
+        case -1:
+            perror("ussetlock");
+            exit(-1);
+    }
+}
+
+static void accept_mutex_off()
+{
+    if (usunsetlock(uslock) == -1) {
+        perror("usunsetlock");
+        exit(-1);
+    }
+}
+
+#elif defined (USE_PTHREAD_SERIALIZED_ACCEPT)
+
+/* This code probably only works on Solaris ... but it works really fast
+ * on Solaris
+ */
+
+#include <pthread.h>
+
+static pthread_mutex_t *accept_mutex;
+
+static void accept_mutex_cleanup(void)
+{
+    if (munmap ((caddr_t)accept_mutex, sizeof (*accept_mutex))) {
+	perror ("munmap");
+    }
+}
+
+static void accept_mutex_init(pool *p)
+{
+    pthread_mutexattr_t mattr;
+    int fd;
+
+    fd = open ("/dev/zero", O_RDWR);
+    if (fd == -1) {
+        perror ("open(/dev/zero)");
+        exit (1);
+    }
+    accept_mutex = (pthread_mutex_t *)mmap ((caddr_t)0, sizeof (*accept_mutex),
+                    PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+    if (accept_mutex == (void *)(caddr_t)-1) {
+        perror ("mmap");
+        exit (1);
+    }
+    close (fd);
+    if (pthread_mutexattr_init(&mattr)) {
+        perror ("pthread_mutexattr_init");
+        exit (1);
+    }
+    if (pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED)) {
+        perror ("pthread_mutexattr_setpshared");
+        exit (1);
+    }
+    if (pthread_mutex_init(accept_mutex, &mattr)) {
+        perror ("pthread_mutex_init");
+        exit (1);
+    }
+}
+
+static void accept_mutex_on()
+{
+    if (pthread_mutex_lock (accept_mutex)) {
+        perror ("pthread_mutex_lock");
+        exit (1);
+    }
+}
+
+static void accept_mutex_off()
+{
+    if (pthread_mutex_unlock (accept_mutex)) {
+        perror ("pthread_mutex_unlock");
+        exit (1);
+    }
+}
+
+#elif defined (USE_SYSVSEM_SERIALIZED_ACCEPT)
+
+#include <sys/types.h>
+#include <sys/ipc.h>
+#include <sys/sem.h>
+
+#ifdef NEED_UNION_SEMUN
+/* it makes no sense, but this isn't defined on solaris */
+union semun {
+    int val;
+    struct semid_ds *buf;
+    ushort *array;
+};
+#endif
+
+static int sem_cleanup_registered;
+static pid_t sem_cleanup_pid;
+static int sem_id = -1;
+static struct sembuf op_on;
+static struct sembuf op_off;
+
+/* We get a random semaphore ... the lame sysv semaphore interface
+ * means we have to be sure to clean this up or else we'll leak
+ * semaphores.  Note that this is registered via atexit, so we can't
+ * do many things in here... especially nothing that would allocate
+ * memory or use a FILE *.
+ */
+static void accept_mutex_cleanup (void)
+{
+    union semun ick;
+
+    if (sem_id < 0) return;
+    if (getpid() != sem_cleanup_pid) return;
+    /* this is ignored anyhow */
+    ick.val = 0;
+    semctl (sem_id, 0, IPC_RMID, ick);
+}
+
+
+static void accept_mutex_init(pool *p)
+{
+    union semun ick;
+    struct semid_ds buf;
+
+    if (!sem_cleanup_registered) {
+	/* only the parent will try to do cleanup */
+	sem_cleanup_pid = getpid();
+	if (atexit (accept_mutex_cleanup)) {
+	    perror ("atexit");
+	    exit (1);
+	}
+	sem_cleanup_registered = 1;
+    }
+    /* acquire the semaphore */
+    sem_id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
+    if (sem_id < 0) {
+       perror ("semget");
+       exit (1);
+    }
+    ick.val = 1;
+    if (semctl(sem_id, 0, SETVAL, ick) < 0) {
+	perror ("semctl(SETVAL)");
+        exit (1);
+    }
+    if (!getuid()) {
+	/* restrict it to use only by the appropriate user_id ... not that this
+	 * stops CGIs from acquiring it and dinking around with it.
+	 */
+	buf.sem_perm.uid = user_id;
+	buf.sem_perm.gid = group_id;
+	buf.sem_perm.mode = 0600;
+	ick.buf = &buf;
+	if (semctl(sem_id, 0, IPC_SET, ick) < 0) {
+	    perror ("semctl(IPC_SET)");
+	    exit (1);
+	}
+    }
+
+    /* pre-initialize these */
+    op_on.sem_num = 0;
+    op_on.sem_op = -1;
+    op_on.sem_flg = SEM_UNDO;
+    op_off.sem_num = 0;
+    op_off.sem_op = 1;
+    op_off.sem_flg = SEM_UNDO;
+}
+
+static void accept_mutex_on()
+{
+    if (semop(sem_id, &op_on, 1) < 0) {
+        perror ("accept_mutex_on");
+        exit (1);
+    }
+}
+
+static void accept_mutex_off()
+{
+    if (semop(sem_id, &op_off, 1) < 0) {
+        perror ("accept_mutex_off");
+        exit (1);
+    }
+}
+
+#elif defined(USE_FCNTL_SERIALIZED_ACCEPT)
 static struct flock lock_it;
 static struct flock unlock_it;
 
 static int lock_fd=-1;
 
+#define accept_mutex_cleanup()
+
 /*
  * Initialize mutex lock.
  * Must be safe to call this on a restart.
  */
-void
-accept_mutex_init(pool *p)
+static void accept_mutex_init(pool *p)
 {
 
     lock_it.l_whence = SEEK_SET;   /* from current point */
@@ -297,7 +527,7 @@
     unlink(lock_fname);
 }
 
-void accept_mutex_on(void)
+static void accept_mutex_on(void)
 {
     int ret;
     
@@ -311,7 +541,7 @@
     }
 }
 
-void accept_mutex_off(void)
+static void accept_mutex_off(void)
 {
     if (fcntl (lock_fd, F_SETLKW, &unlock_it) < 0)
     {
@@ -320,16 +550,18 @@
 	exit(1);
     }
 }
+
 #elif defined(USE_FLOCK_SERIALIZED_ACCEPT)
 
 static int lock_fd=-1;
 
+#define accept_mutex_cleanup()
+
 /*
  * Initialize mutex lock.
  * Must be safe to call this on a restart.
  */
-void
-accept_mutex_init(pool *p)
+static void accept_mutex_init(pool *p)
 {
 
     expand_lock_fname (p);
@@ -343,7 +575,7 @@
     unlink(lock_fname);
 }
 
-void accept_mutex_on(void)
+static void accept_mutex_on(void)
 {
     int ret;
     
@@ -357,7 +589,7 @@
     }
 }
 
-void accept_mutex_off(void)
+static void accept_mutex_off(void)
 {
     if (flock (lock_fd, LOCK_UN) < 0)
     {
@@ -366,15 +598,28 @@
 	exit(1);
     }
 }
+
 #else
 /* Default --- no serialization.  Other methods *could* go here,
  * as #elifs...
  */
+#define accept_mutex_cleanup()
 #define accept_mutex_init(x)
 #define accept_mutex_on()
 #define accept_mutex_off()
 #endif
 
+/* On some architectures it's safe to do unserialized accept()s in the
+ * single Listen case.  But it's never safe to do it in the case where
+ * there's multiple Listen statements.  Define SAFE_UNSERIALIZED_ACCEPT
+ * when it's safe in the single Listen case.
+ */
+#ifdef SAFE_UNSERIALIZED_ACCEPT
+#define SAFE_ACCEPT(stmt) do {if(listeners->next != listeners) {stmt;}} while(0)
+#else
+#define SAFE_ACCEPT(stmt) do {stmt;} while(0)
+#endif
+
 void usage(char *bin)
 {
     fprintf(stderr,"Usage: %s [-d directory] [-f file] [-v] [-h] [-l]\n",bin);
@@ -1471,6 +1716,7 @@
 void sig_term(int sig) {
     log_error("httpd: caught SIGTERM, shutting down", server_conf);
     cleanup_scoreboard();
+    accept_mutex_cleanup();
 #ifdef SIGKILL
     ap_killpg (pgrp, SIGKILL);
 #endif /* SIGKILL */
@@ -2380,7 +2626,8 @@
          * Wait for an acceptable connection to arrive.
          */
 
-        accept_mutex_on();  /* Lock around "accept", if necessary */
+	/* Lock around "accept", if necessary */
+        SAFE_ACCEPT(accept_mutex_on());
 
         for (;;) {
 	    if (listeners->next != listeners) {
@@ -2445,7 +2692,7 @@
 		child_exit_modules(pconf, server_conf);
         }
 
-        accept_mutex_off(); /* unlock after "accept" */
+        SAFE_ACCEPT(accept_mutex_off()); /* unlock after "accept" */
 
 	/* We've got a socket, let's at least process one request off the
 	 * socket before we accept a graceful restart request.
@@ -2787,7 +3034,7 @@
 	init_modules (pconf, server_conf);
 	open_logs (server_conf, pconf);
 	set_group_privs ();
-	accept_mutex_init (pconf);
+	SAFE_ACCEPT(accept_mutex_init (pconf));
 	if (!is_graceful) {
 	    reinit_scoreboard(pconf);
 	}
@@ -2928,6 +3175,8 @@
 	    log_error ("SIGHUP received.  Attempting to restart", server_conf);
 	}
 	++generation;
+
+	SAFE_ACCEPT(accept_mutex_cleanup());
 
     } while (restart_pending);
 
Index: httpd.h
===================================================================
RCS file: /export/home/cvs/apache/src/httpd.h,v
retrieving revision 1.137
diff -u -r1.137 httpd.h
--- httpd.h	1997/08/05 06:02:43	1.137
+++ httpd.h	1997/08/08 10:36:55
@@ -157,8 +157,8 @@
 
 /* Define this to be what type you'd like returned for files with unknown */
 /* suffixes */
-#ifndef DEFAULT_TYPE
-#define DEFAULT_TYPE "text/plain"
+#ifndef DEFAULT_CONTENT_TYPE
+#define DEFAULT_CONTENT_TYPE "text/plain"
 #endif
 
 /* Define this to be what your per-directory security files are called */



Mime
View raw message