httpd-dev mailing list archives

From dean gaudet <dgaudet-list-new-ht...@arctic.org>
Subject Re: mod_file_cache performance
Date Tue, 03 Jul 2001 04:16:38 GMT


On Mon, 2 Jul 2001, Brian Pane wrote:

> dean gaudet wrote:
> [...]
>
> >a mostly optimal syscall sequence for responses to a keep-alive
> >connection on linux should look something like:
> >
> >	sockfd = accept();
> >	fcntl(sockfd, F_SETFL, O_NDELAY)
> >	setsockopt(sockfd, TCP_CORK = 1)
> >	while (1) {
> >		rc = read(sockfd);
> >		if (rc <= 0) {
> >			save_errno = errno;
> >			/* send any remaining packets now */
> >			setsockopt(sockfd, TCP_CORK = 0);
> >			if (rc == 0) break;
> >			if (save_errno == EAGAIN) {
> >				poll(until we can read sockfd);
> >				continue;
> >			}
> >			/* log error */
> >			break;
> >		}
> >		/* parse request */
> >		respfd = open(response_file);
> >		write(sockfd, response headers)
> >		sendfile(sockfd, respfd)
> >		close(respfd)
> >	}
> >	close(sockfd);
> >
>
> The current 2.0 httpd does basically this, except that it resets the
> TCP_CORK and TCP_NODELAY flags (to 'off' and 'on,' respectively)
> after each sendfile call (rather than just when it gets EAGAIN on a
> read).  This seems like a bug.  Is there some context in which the resetting
> of these flags after every sendfile call is really necessary?

there should be no reason to tweak TCP_NODELAY except at socket creation
time.

TCP_CORK MUST be popped when a read() would block.

TCP_CORK MAY be popped when a request is taking "too long" in the parsing
stage, prior to generating any output.  this is really hard to do
efficiently (and apache doesn't attempt it yet); i've argued this is a
design mistake in TCP_CORK, and an alternative API has been proposed to
fix it.

basically in this last case you want to think about a pipelined connection
in which requests A and B arrive in the same packet.  request A is a
static request, and request B is a dynamic request that takes several
seconds to generate its output.  the last packet of request A will be
delayed until request B begins its output.  (it's even more complicated
when you start thinking about WebMUX for HTTP/ng.)

the ideal solution is a more correct variation of nagle:  the kernel
provides an explicit "flush incomplete packets now" interface.  in this
scenario you leave nagle enabled on the socket, and you signal a flush
when a read() would block.  otherwise you let nagle deal with the
incomplete packet.  this gives the timeout packet flush when request
parsing is taking "too long".  (assuming you always write() or sendfile()
after each response.)

that way you take advantage of all the lightweight timing gear which
already needs to be in the kernel for the TCP stack... and avoid
duplicating the machinery in userland.  (although i guess there are
other hacks to do this... but i'm rambling now :)

-dean

