From Rainer Jung <>
Subject Re: Problem with file descriptor handling in httpd 2.3.1
Date Sun, 04 Jan 2009 15:55:53 GMT
On 04.01.2009 16:22, Rainer Jung wrote:
> On 04.01.2009 15:56, Ruediger Pluem wrote:
>> On 01/04/2009 03:48 PM, Rainer Jung wrote:
>>> On 04.01.2009 15:40, Ruediger Pluem wrote:
>>>> On 01/04/2009 03:26 PM, Rainer Jung wrote:
>>>>> On 04.01.2009 14:14, Ruediger Pluem wrote:
>>>>>> On 01/04/2009 11:24 AM, Rainer Jung wrote:
>>>>>>> On 04.01.2009 01:51, Ruediger Pluem wrote:
>>>>>>>> On 01/04/2009 12:49 AM, Rainer Jung wrote:
>>>>>>>>> On 04.01.2009 00:36, Paul Querna wrote:
>>>>>>>>>> Rainer Jung wrote:
>>>>>>>>>>> During testing 2.3.1 I noticed a lot of errors of type EMFILE: "Too
>>>>>>>>>>> many open files". I used strace and the problem looks like this:
>>>>>>>>>>> - The test case is using ab with HTTP keep alive, concurrency 20 and a
>>>>>>>>>>> small file, so doing about 2000 requests per
>>>>>>>> What is the exact size of the file?
>>>>>>> It is the index.html, via URL /, so size is 45 Bytes.
>>>>>> Can you try if you run in the same problem on 2.2.x with a file of
>>>>>> size 257 bytes?
>>>>> I tried on the same type of system with event MPM and 2.2.11. Can't
>>>>> reproduce even with content file of size 257 bytes.
>>>> Possibly you need to increase the number of threads per process with
>>>> event MPM
>>>> and the number of concurrent requests from ab.
>>> I increased the maximum KeepAlive Requests and the KeepAlive timeout a
>>> lot and during a longer running test I see always exactly as many open
>>> FDs for the content file in /proc/PID/fd as I had concurrency in ab. So
>>> it seems the FDs always get closed before handling the next request in
>>> the connection.
>>> After testing the patch, I'll try it again with 257 bytes on 2.2.11 with
>>> prefork or worker.
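For reference, the /proc/PID/fd check mentioned above can be scripted. This is only a rough sketch, assuming Linux and permission to read the process's /proc entries; the function name count_open_fds is made up for illustration and is not part of httpd or ab.

```python
import os


def count_open_fds(pid, target_path):
    """Count descriptors of process `pid` that point at `target_path`.

    Assumes a Linux-style /proc filesystem (hypothetical helper).
    """
    fd_dir = f"/proc/{pid}/fd"
    target = os.path.realpath(target_path)
    count = 0
    for name in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if link == target:
            count += 1
    return count
```

Calling it repeatedly with the PID of one httpd child and the path of the content file during a test run should show whether the count stays at the ab concurrency or keeps growing.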
>> IMHO this cannot happen with prefork on 2.2.x. So I guess it is not
>> worth testing.
>> It still confuses me that this happens on trunk as it looks like that
>> ab does not
>> do pipelining.
> The strace log shows that the sequence really is
> - new connection
> - read request
> - open file
> - send response
> - log request
> repeat this triplet a lot of times (maybe as long as KeepAlive is
> active) and then there are a lot of close() calls for the content files.
> Not sure about the exact thing that triggers the close.
> So I don't necessarily see pipelining (in the sense of sending more
> requests before responses return) being necessary.
> I tested your patch (worker, trunk): It does not help. I then added an
> error log statement directly after the requests++ and it shows this
> number is always "1".

I can now even reproduce it without load. Simply open a connection and send 
hand-crafted KeepAlive requests via telnet. The file descriptors are 
kept open as long as the connection is alive. I'll run under the 
debugger to see what the stack looks like when the file gets closed.
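
The telnet reproduction can also be scripted. The following is a minimal sketch, not the exact requests I typed: the function names, the default path and the hold_seconds parameter are assumptions for illustration, and it assumes the server answers with a Content-Length header (as for a small static file) rather than chunked encoding.

```python
import socket
import time


def build_request(host, path="/", keep_alive=True):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    connection = "keep-alive" if keep_alive else "close"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Connection: {connection}\r\n"
        "\r\n"
    ).encode("ascii")


def read_response(reader):
    """Read one response from a binary file-like object.

    Only handles responses that carry a Content-Length header.
    """
    status = reader.readline()
    length = 0
    while True:
        line = reader.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # end of headers (or connection closed)
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    body = reader.read(length)
    return status.decode("latin-1").strip(), body


def send_keepalive_requests(host, port, path="/", count=3, hold_seconds=0):
    """Send `count` requests over one connection and return the status lines."""
    statuses = []
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        for _ in range(count):
            sock.sendall(build_request(host, path))
            status, _body = read_response(reader)
            statuses.append(status)
        # Keep the connection open so /proc/PID/fd of the server
        # can be inspected before the connection goes away.
        time.sleep(hold_seconds)
    return statuses
```

Running send_keepalive_requests against the test server with a non-zero hold_seconds leaves time to look at the open descriptors of the serving process while the connection is still alive.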

Since the logging is done much earlier (directly after each request) the 
problem does not seem to be directly related to EOR. It looks like 
somehow the close file cleanup does not run when the request pool is 
destroyed or maybe it is registered with the connection pool. gdb should 

More later.
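
To illustrate the suspicion about pools, here is a toy model. It is not APR code: the Pool class and the simulate function are invented for this sketch and only mimic the idea that a cleanup runs when the pool it was registered with is destroyed. It compares registering the close-file cleanup with the request pool against registering it with the connection pool.

```python
class Pool:
    """Toy stand-in for a memory pool with cleanups (not the APR API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.cleanups = []
        if parent is not None:
            parent.children.append(self)

    def register_cleanup(self, fn):
        self.cleanups.append(fn)

    def destroy(self):
        # Destroy child pools first, then run this pool's cleanups
        # in reverse order of registration.
        for child in list(self.children):
            child.destroy()
        while self.cleanups:
            self.cleanups.pop()()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None


def simulate(num_requests, owner):
    """Serve `num_requests` on one keep-alive connection.

    `owner` is "request" or "connection" and selects the pool that
    receives the close-file cleanup. Returns the number of files still
    open after each request, and the number open after the connection
    pool has been destroyed.
    """
    open_files = set()
    open_after_each_request = []
    conn_pool = Pool("connection")
    for fd in range(num_requests):
        req_pool = Pool("request", parent=conn_pool)
        open_files.add(fd)  # "open" the content file
        target = req_pool if owner == "request" else conn_pool
        target.register_cleanup(lambda fd=fd: open_files.discard(fd))
        req_pool.destroy()  # end of request
        open_after_each_request.append(len(open_files))
    conn_pool.destroy()  # end of connection
    return open_after_each_request, len(open_files)
```

In this model simulate(3, "request") never leaves a file open after a request, while simulate(3, "connection") accumulates one open file per request until the connection ends, which matches the pattern seen in the strace log.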


