www-apache-bugdb mailing list archives

From Dean Gaudet <dgau...@arctic.org>
Subject Re: general/885: After a period of time (not found to coincide with server rehashes or any specific access), the server will read requests, but return no data (and close the connection). It will still respond to a server-status request though.
Date Sat, 19 Jul 1997 20:50:01 GMT
The following reply was made to PR general/885; it has been noted by GNATS.

From: Dean Gaudet <dgaudet@arctic.org>
To: Illuminatus Primus <vermont@gate.net>
Subject: Re: general/885: After a period of time (not found to coincide with server rehashes
or any specific access), the server will read requests, but return no data (and close the
connection).  It will still respond to a server-status request though.
Date: Sat, 19 Jul 1997 13:44:44 -0700 (PDT)

 
 On Sat, 19 Jul 1997, Illuminatus Primus wrote:
 
 > (wasn't sure if i should cc apache-bugdb)..
 
 It helps keep an audit trail of the report.  But no biggie.
 
 > I grabbed another server-status (strange that that works, but retrieving
 > documents doesn't), and it's pretty much like the last one..
 
 Well server-status doesn't require it to access any file, which is why
 I was wondering if you used NFS.
 
 > > Do you use NFS on these servers?
 > 
 > Yes.. but Apache and none of the files it would access are mounted from a
 > remote host (the user directories are exported however)..
 
 So it's an NFS server and users change files remotely, but absolutely
 nothing Apache needs is NFS-mounted on the web server itself?
 
 > No, logs are piped into a small c program i wrote to filter logs into
 > different log files based on username (thanks to logformat this was easy).
 
 Log piping isn't terribly reliable in any version of Apache at the moment...
 but a broken pipe would affect server-status just as well.  (We have a
 design for better piping, I hope to implement it in the next few weeks.)
 
 > The log splitter does have internal numeric ip resolution (with some heavy
 > duty caching) and before I set the lookup timeout length to 2 seconds it
 > could lock up the server when someone accessed it from an ip that was
 > hosted by an unresponsive name server.. However, server status would show
 > the server full of requests, and accesses would hang instead of returning
 > null pages when this happened... I haven't seen it happen again since the
 > timeout code was added to the log splitter.
 
 You could consider doing the lookups asynchronously.  The key thing is
 to empty your incoming pipe as fast as possible.
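
A rough sketch of that idea, in Python rather than the C of the actual log splitter (which isn't shown here): one thread does nothing but drain the pipe into memory, and a second does the slow lookups. The names `drain`, `resolve`, `process` and `run` are made up for illustration, and it assumes the client IP is the first space-separated field of each log line.

```python
import queue
import socket
import threading


def drain(pipe, pending):
    """Reader: move lines off the pipe into memory as fast as possible."""
    for line in pipe:
        pending.put(line)
    pending.put(None)  # sentinel: the writing side closed the pipe


def resolve(ip, cache, lookup=socket.gethostbyaddr):
    """Return a hostname for ip, falling back to the raw address on failure."""
    if ip not in cache:
        try:
            cache[ip] = lookup(ip)[0]
        except OSError:  # socket.herror and socket.gaierror are subclasses
            cache[ip] = ip
    return cache[ip]


def process(pending, out, lookup=socket.gethostbyaddr):
    """Worker: do the slow lookups without ever touching the pipe."""
    cache = {}
    while True:
        line = pending.get()
        if line is None:
            break
        # Assumption: the client IP is the first field of the log line.
        ip, _, rest = line.partition(" ")
        out.write(resolve(ip, cache, lookup) + " " + rest)


def run(pipe, out, lookup=socket.gethostbyaddr):
    # Unbounded queue, so a slow lookup never makes the reader block.
    pending = queue.Queue()
    reader = threading.Thread(target=drain, args=(pipe, pending), daemon=True)
    reader.start()
    process(pending, out, lookup)
    reader.join()
```

Calling `run(sys.stdin, sys.stdout)` would wire it up as a piped logger. The trade-off is memory: if lookups stall for a long time the queue grows, but the server's end of the pipe never fills.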
 
 > I'm running ISS #4, which I've heard many good reports about.. I've held
 > off upgrading to 2.0.30 due to small bugs I've heard are lurking in it
 > (missing sysctl for one).. I might consider upgrading to a 2.0.31
 > prepatch..
 
 Unless you use sysctl it's not a biggie... and there's a patch for it
 somewhere on www.linuxhq.com.  I tune everything via /proc.  I've got
 pre-2.0.31-2 on a few machines.  Don't go to it unless your machine is
 a dedicated web server with enough RAM to avoid ever swapping.
 
 > > What happens if you add -DNO_SLACK to EXTRA_CFLAGS in your
 > > Configuration and rebuild?
 > 
 > I just checked and there is no NO_SLACK option for 1.2b11.. and the same
 > thing just happened for that version too.  Or is slack enabled by default
 > in 1.2b11 and there simply isn't an option to disable it?
 
 No, the slack code didn't exist until 1.2.1.  So don't worry about it,
 you've eliminated it as a problem.
 
 If you have LOTS of disk to spare you could run strace on the parent
 with -f -ff to do full tracing of all children.  Alternately you
 could wait until the server is in a hung state then try to run
 strace -p against (up to) 32 of the children and hope to catch some
 useful trace info.  In any event, should you do one of these, then
 I probably would only need the tail 30 or 40 lines of tracing on each
 pid ... please don't send me a gig of traces ;)
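
For trimming the traces down, something like the following might do. It is only a sketch, and it assumes the traces were written with an `-o trace` prefix alongside `-f -ff`, so that strace leaves one file per process named `trace.<pid>`; the prefix and the helper names `tail_lines` and `summarize` are illustrative.

```python
from collections import deque
from pathlib import Path


def tail_lines(path, count=40):
    """Return the last `count` lines of one trace file."""
    with open(path, errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return list(deque(handle, maxlen=count))


def summarize(directory, prefix="trace", count=40):
    """Join the tail of every per-pid trace file under one header each."""
    parts = []
    # Assumption: files are named <prefix>.<pid>, as `strace -ff -o` writes them.
    for path in sorted(Path(directory).glob(prefix + ".*")):
        parts.append("==> %s <==\n" % path.name)
        parts.extend(tail_lines(path, count))
    return "".join(parts)
```

Printing `summarize(".")` from the directory holding the trace files gives a few dozen lines per pid, which is small enough to mail.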
 
 It would really help if you could get a smaller reproducible example.
 
 Is there anything else weird about the machines when the lockup happens?
 Any kernel messages?  Do you have to hard-restart the machine or just
 restart apache?
 
 Any possibility of logging direct to disk and running something like
 "tail -f access_log | my_pipe_logger_program" for a while?
 
 Dean
 
