httpd-dev mailing list archives

From: dgau...@hotwired.com (Dean Gaudet)
Subject: Re: irix 5.3 and 1.1.1
Date: Wed, 21 Aug 1996 21:02:03 GMT

I upgraded the RAM in the machines and the problem died down mostly.
There's other crap to blame on my side before I really dig into Apache...
unfortunately today looks to be lower volume than yesterday so I can't
test how the other changes I put in yesterday affected things.  I just
wanted to be sure I hadn't missed another related patch.

Dean

In article <hot.mailing-lists.new-httpd-199608211929.PAA09421@telebase.com>,
Chuck Murcko  <new-httpd@hyperreal.com> wrote:
>Are we getting an unexpected EINVAL or something like it that's messing
>up the mutex operation? We just found something like that here with
>some Solaris software we'd written.
>
>One possible scenario that could cause this is a bad pointer stepping
>on lock_it or unlock_it in the fcntl() calls.
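A minimal sketch of how that scenario could be caught, assuming the mutex is
a pair of static struct flock requests passed to fcntl(F_SETLKW). The names
lock_it and unlock_it come from the message above; init_locks, apply_lock and
demo_lock_cycle are hypothetical helpers, not Apache functions. The idea is
to retry only on EINTR and report any other errno instead of ignoring it, so
a trampled request shows up as an error (typically EINVAL) rather than a
silently broken mutex.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static struct flock lock_it;
static struct flock unlock_it;

/* Build whole-file write-lock and unlock requests. */
static void init_locks(void)
{
    memset(&lock_it, 0, sizeof lock_it);
    lock_it.l_type = F_WRLCK;
    lock_it.l_whence = SEEK_SET;
    lock_it.l_start = 0;
    lock_it.l_len = 0;              /* 0 means "to end of file" */

    unlock_it = lock_it;
    unlock_it.l_type = F_UNLCK;
}

/* Apply one request; retry on EINTR, return 0 or the failing errno. */
static int apply_lock(int fd, struct flock *op)
{
    while (fcntl(fd, F_SETLKW, op) == -1) {
        if (errno == EINTR)
            continue;
        return errno;
    }
    return 0;
}

/* Returns 0 if a good lock/unlock cycle works and a damaged request fails. */
static int demo_lock_cycle(void)
{
    char path[] = "/tmp/lock-sketch-XXXXXX";
    struct flock bad;
    int fd, err_lock, err_unlock, err_bad;

    fd = mkstemp(path);
    assert(fd != -1);
    unlink(path);                   /* file goes away when fd is closed */

    init_locks();
    err_lock = apply_lock(fd, &lock_it);
    err_unlock = apply_lock(fd, &unlock_it);

    /* Simulate a bad pointer having stepped on the request. */
    bad = lock_it;
    bad.l_type = 99;                /* not F_RDLCK, F_WRLCK or F_UNLCK */
    err_bad = apply_lock(fd, &bad);
    if (err_bad != 0)
        fprintf(stderr, "bad lock request: %s\n", strerror(err_bad));

    close(fd);
    return (err_lock == 0 && err_unlock == 0 && err_bad != 0) ? 0 : -1;
}
```

Logging the errno from apply_lock in the accept loop would show whether the
lock calls are failing at all before anyone goes hunting for a stray pointer.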
>
>Dean Gaudet liltingly intones:
>> 
>> I think I'm running into the children-not-dying problem on irix 5.3
>> under 1.1.1.  I applied a patch from Ben that I thought dealt with this,
>> but it doesn't seem to be working.  Have I missed another related patch?
>> I'll include (part of) Ben's patch below for reference (revision numbers
>> are mine, not hyperreal's).
>> 
>> The symptom is:  the machine's load shoots up to 27+ at which point a
>> monitoring script I have running cuts in and kills the webserver and
>> restarts it.
>> 
>> Dean
>> 
>> Index: http_main.c
>> ===================================================================
>> RCS file: /hot/repository/apache/src/http_main.c,v
>> retrieving revision 1.16
>> retrieving revision 1.17
>> diff -c -r1.16 -r1.17
>> *** http_main.c	1996/08/02 07:19:49	1.16
>> --- http_main.c	1996/08/02 07:23:10	1.17
>> ***************
>> *** 845,853 ****
>>   #endif
>>   }
>>   
>>   int wait_or_timeout (int *status)
>>   {
>> !     wait_or_timeout_retval = -1;
>>       
>>   #if defined(NEXT)
>>       if (setjmp(wait_timeout_buf) != 0) {
>> --- 845,874 ----
>>   #endif
>>   }
>>   
>> + #ifdef BROKEN_WAIT
>> + /*
>> + Some systems appear to fail to deliver dead children to wait() at times.
>> + This sorts them out.
>> + */
>> + void reap_children()
>> +     {
>> +     int status,n;
>> + 
>> +     for(n=0 ; n < HARD_SERVER_LIMIT ; ++n)
>> + 	if(scoreboard_image->servers[n].status != SERVER_DEAD
>> + 	   && waitpid(scoreboard_image->servers[n].pid,&status,WNOHANG) == -1
>> + 	   && errno == ECHILD)
>> + 	    {
>> + 	    sync_scoreboard_image();
>> + 	    update_child_status(n,SERVER_DEAD,NULL);
>> + 	    }
>> +     }
>> + #endif
>> + 
>>   int wait_or_timeout (int *status)
>>   {
>> !     int wait_or_timeout_retval = -1;
>> !     static int ntimes;
>>       
>>   #if defined(NEXT)
>>       if (setjmp(wait_timeout_buf) != 0) {
>> ***************
>> *** 857,863 ****
>>   	errno = ETIMEDOUT;
>>   	return wait_or_timeout_retval;
>>       }
>> !     
>>       signal (SIGALRM, longjmp_out_of_alarm);
>>       alarm(1);
>>   #if defined(NEXT)
>> --- 878,890 ----
>>   	errno = ETIMEDOUT;
>>   	return wait_or_timeout_retval;
>>       }
>> ! #ifdef BROKEN_WAIT
>> !     if(++ntimes == 60)
>> ! 	{
>> ! 	reap_children();
>> ! 	ntimes=0;
>> ! 	}
>> ! #endif
>>       signal (SIGALRM, longjmp_out_of_alarm);
>>       alarm(1);
>>   #if defined(NEXT)
>> 
>
>chuck
>Chuck Murcko	N2K Inc.	Wayne PA	chuck@telebase.com
>And now, on a lighter note:
>Our OS who art in CPU, UNIX be thy name.
>	Thy programs run, thy syscalls done,
>	In kernel as it is in user!


