httpd-dev mailing list archives

From Cliff Skolnick <cl...@organic.com>
Subject Re: WWW Form Bug Report: "accept() protocol errors hangs server" on Solaris 2.x (fwd)
Date Mon, 11 Mar 1996 21:58:53 GMT

Arg... this may be a Solaris bug; I guess my patch did not work.  I think
I will just have the child die when the EPROTO happens.  ARG :(
I am also going to make sure this gets filed as a bug...
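
A minimal sketch of that plan, to make the intended error handling concrete.
The enum and function names here are hypothetical and do not come from
http_main.c; only EINTR, EPROTO and the "log and retry" behaviour are taken
from the quoted patch below. Whether releasing the accept mutex and exiting
is enough to avoid the hang is still unresolved in this thread, so treat this
as an illustration of the decision, not a fix.

```c
#include <errno.h>

/* Hypothetical names for illustration; not part of http_main.c. */
enum accept_action {
    ACCEPT_RETRY,       /* interrupted by a signal: just call accept() again */
    ACCEPT_LOG_RETRY,   /* unexpected error: log it, then call accept() again */
    ACCEPT_CHILD_EXIT   /* EPROTO: release the accept mutex and let the child die */
};

/* Decide what the child should do after accept() has returned -1. */
static enum accept_action classify_accept_errno(int err)
{
    if (err == EINTR)
        return ACCEPT_RETRY;
#ifdef EPROTO
    /* Assumption: on affected Solaris 2.x systems this means the
     * connection was reset before accept() completed. */
    if (err == EPROTO)
        return ACCEPT_CHILD_EXIT;
#endif
    return ACCEPT_LOG_RETRY;
}
```

In the accept loop, the ACCEPT_CHILD_EXIT case would call accept_mutex_off()
and then exit, so the parent can start a fresh child.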

--
Cliff Skolnick                                      cliff@organic.com

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." -- Benjamin Franklin, 1759

---------- Forwarded message ----------
Date: Mon, 11 Mar 1996 12:26:35 +0100 (MET)
From: Torbjorn.wictorin@its.uu.se
To: Cliff Skolnick <cliff@organic.com>
Subject: Re: WWW Form Bug Report: "accept() protocol errors hangs server" on Solaris 2.x

The patch you sent me does not look like my source code...
There is no 'else' above the while in my version (1.0.3).
Is there a 1.0.4 somewhere?

Also, the patch I proposed does not seem to work either.
Seems that there must be some sort of lock other than the accept_mutex
lock that has to be cleaned up before an exit.


Torbjörn Wictorin
Uppsala Universitet, ITS, Box 887, S-751 08 Uppsala, Sweden
+46 18 18 77 33
torbjorn.wictorin@its.uu.se

On Fri, 8 Mar 1996, Cliff Skolnick wrote:

> 
> 
> I am just about to submit a patch for this...I found the same problem a 
> couple weeks ago.  Give this a try and see if it fixes the problem.  If 
> it does please email me and I will integrate the fix into the next release.
> 
> I am worried that this will not fix your problem, since our server never 
> hung, so please do email me after your test.  I'd like to get this 
> problem fixed.
> 
> Thanks,
>   Cliff
> 
> 
> *** http_main.c	Tue Feb 27 00:10:20 1996
> --- http_main.c-orig	Fri Mar  8 00:26:00 1996
> ***************
> *** 906,927 ****
>   	    }
>   	} else
>   	    while ((csd=accept(sd, &sa_client, &clen)) == -1) {
> - #ifdef SOLARIS2
> - /*
> -  * For solaris 2.x, where x < 6  (and possibly other OSs without in kernel
> -  * sockets)
> -  *
> -  * This one is caused when the "ESTABLISHED" connection is in the kernel but
> -  * receives a RST before the library accept() has completed the accept [
> -  * received T_CONN_IND but not yet returned a T_CONN_RES message ]. The
> -  * workaround for the EPROTO error is to ignore it and retry accept().
> -  */
> - 		if ((errno != EINTR) && (errno != EPROTO))
> - 		    log_error("socket error: accept failed", server_conf);
> - #else /* SOLARIS2 */
>   		if (errno != EINTR)
>   		    log_error("socket error: accept failed", server_conf);
> - #endif /* SOLARIS2 */
>   	    }
>   
>   	accept_mutex_off(); /* unlock after "accept" */
> --- 906,913 ----
> 
> 
> On Thu, 7 Mar 1996 torbjorn.wictorin@its.uu.se wrote:
> 
> > Submitter: torbjorn.wictorin@its.uu.se
> > Operating system: Solaris 2.x, version: 2.4
> > Extra Modules used: 
> > URL exhibiting problem: 
> > 
> > Symptoms:
> > --
> > Under heavy load, the accept() call sometimes
> > returns with EPROTO.
> > This is logged by the server.
> > Thereafter it loops back and does a new accept().
> > 
> > When as many EPROTO errors have occurred as there are
> > subtasks, Apache hangs.
> > 
> > Changed the code so that it does the following:
> > 
> > 	free the mutex semaphore
> > 	exit(0)
> > 
> > Now it seems to work.
> > 
> > Of course the real error is something in the
> > Solaris kernel. Don't know what. Do you have any ideas?
> > 
> > /Torbjorn
> >  
> > --
> > 
> > Backtrace:
> > --
> > 
> > --
> > 
> 
> --
> Cliff Skolnick                                      cliff@organic.com
> 
> "They that can give up essential liberty to obtain a little temporary
> safety deserve neither liberty nor safety." -- Benjamin Franklin, 1759
> 
> 

