Date: Fri, 1 Jun 2001 11:47:59 -0700 (PDT)
From: Ryan Bloom <rbb@apache.org>
To: new-httpd@apache.org
Subject: Re: file/mmap buckets, subrequests, pools, 2.0.18

On 1 Jun 2001, Jeff Trawick wrote:

>  writes:
>
> > On Fri, 1 Jun 2001, Jeff Trawick wrote:
> >
> > > Greg Ames put together a 2.0.18 build for the purposes of running on
> > > apache.org, but we were never able to switch over to it from 2.0.16
> > > because of a hard failure with certain types of requests.
> > >
> > > The problem, trying to simplify a bit:
> > >
> > > In a subrequest, default_handler opens a file using the pool
> > > associated with the subrequest and creates a file bucket.
> > >
> > > After the subrequest is over, we still have the file bucket with the
> > > apr_file_t that had the now-destroyed pool. Any references to the
> > > apr_file_t are broken because the pool was destroyed.
> > >
> > > We finally died back in the main request processing because we
> > > referenced an mmap bucket whose apr_mmap_t used the same pool as the
> > > apr_file_t. The apr_mmap_t had been overlaid because the storage was
> > > released at the end of the main request and subsequently reused for
> > > some other purpose.
> > >
> > > We have never hit this problem with 2.0.16, and we always hit it with
> > > 2.0.18, leading me to believe that the problem was affected by recent
> > > code changes, even though it seems to be a basic design issue.
> > >
> > > Any ideas about what recent changes might have introduced or exposed
> > > this problem?
> >
> > This is relatively simple. A while ago, I changed the default handler
> > to use the connection pool to fix this problem. A couple of months
> > ago, Dean pointed out that this was a major resource leak. After
> > 2.0.16, somebody (Roy?) pointed out that this was a pretty big problem
> > when serving a lot of very large files on the same connection.
> >
> > The solution was a simple loop at the end of the core_output_filter
> > that reads the data from the file into memory. This is okay to do,
> > because we are guaranteed to have less than 9k of data. It sounds like
> > the problem is that we don't read in the data if we have an MMAP, or
> > we may not be getting into the loop on sub-requests.
>
> By "simple loop" do you mean the call to ap_save_brigade()?
>
> I haven't verified that we even get to that point on a subrequest, but
> my current guess is that we get to ap_save_brigade() but the setaside
> doesn't do anything for the mmap bucket because mmaps aren't normally
> transient.
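Before digging into the filter code, the lifetime problem in code form
looks roughly like this. This is a hypothetical sketch, not the actual
default_handler: the handler name is made up, the bucket creation is
elided, and the signatures are just from memory of the 2.0 tree:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"
#include "apr_file_io.h"

/* Sketch only: the pattern that bites us on subrequests. */
static int serve_file_from_subreq(request_rec *r)  /* r is a subrequest */
{
    apr_file_t *fd;
    apr_bucket_brigade *bb = apr_brigade_create(r->pool);

    /* The file is opened out of the *subrequest* pool... */
    apr_file_open(&fd, r->filename, APR_READ, APR_OS_DEFAULT, r->pool);

    /* ...a file bucket wrapping fd is sent down the filter stack, and a
     * read may morph it into an mmap bucket whose apr_mmap_t also lives
     * in r->pool. */
    /* ... create a file bucket from fd and insert it into bb ... */

    return ap_pass_brigade(r->output_filters, bb);
}

/* Back in the main request, ap_destroy_sub_req() clears the subrequest
 * pool, so fd (and any apr_mmap_t made from it) is gone -- but the
 * bucket can still be sitting in core_output_filter's saved brigade.
 * That dangling reference is exactly the overlaid apr_mmap_t Jeff
 * describes above. */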
Okay, I had a chance to look at the code, instead of going from memory.
The code that is causing the problem is:

if ((!fd && !more && (nbytes + flen < AP_MIN_BYTES_TO_WRITE)
     && !APR_BUCKET_IS_FLUSH(e))
    || (nbytes + flen < AP_MIN_BYTES_TO_WRITE
        && APR_BUCKET_IS_EOS(e) && c->keepalive)) {
    /* NEVER save an EOS in here. If we are saving a brigade with
     * an EOS bucket, then we are doing keepalive connections, and
     * we want to process the second request fully.
     */
    if (APR_BUCKET_IS_EOS(e)) {
        apr_bucket *bucket = NULL;
        /* If we are in here, then this request is a keepalive. We
         * need to be certain that any data in a bucket is valid
         * after the request_pool is cleared.
         */
        if (ctx->b == NULL) {
            ctx->b = apr_brigade_create(f->c->pool);
        }
        APR_BRIGADE_FOREACH(bucket, b) {
            const char *str;
            apr_size_t n;

            rv = apr_bucket_read(bucket, &str, &n, APR_BLOCK_READ);
            /* This apr_brigade_write does not use a flush function,
             * because we assume that we will not write enough data
             * into it to cause a flush. However, if we *do* write
             * "too much", then we could end up with transient
             * buckets, which would suck. This works for now, but is
             * a bit shaky if changes are made to some of the
             * buffering sizes. Let's do an assert to prevent
             * potential future problems...
             */
            AP_DEBUG_ASSERT(AP_MIN_BYTES_TO_WRITE <= APR_BUCKET_BUFF_SIZE);
            apr_brigade_write(ctx->b, NULL, NULL, str, n);
        }
        apr_brigade_destroy(b);
    }
    else {
        ap_save_brigade(f, &ctx->b, &b);
    }
    return APR_SUCCESS;
}

This is meant to solve the problem, but we only go into the loop that
solves it if we see an EOS bucket. Of course, the sub-request filter is
removing the EOS bucket. So the solution, IMO, is to do this same loop
in the subreq_output_filter.

Ryan

_______________________________________________________________________________
Ryan Bloom                                              rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------
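
P.S. For concreteness, here is roughly the loop I am picturing in the
subrequest output filter. This is an untested sketch, not a patch: the
filter name is approximate, using the parent request's pool is my
assumption, and it deliberately skips the AP_MIN_BYTES_TO_WRITE-style
size checks that core_output_filter does (a real patch would want
those, or we reintroduce the resource leak Dean complained about):

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

static apr_status_t subreq_output_filter(ap_filter_t *f,
                                         apr_bucket_brigade *b)
{
    if (!APR_BRIGADE_EMPTY(b)
        && APR_BUCKET_IS_EOS(APR_BRIGADE_LAST(b))) {
        /* Copy the data into memory that lives at least as long as
         * the main request, so no file/mmap bucket survives pointing
         * into the subrequest pool. (Assumes f->r->main is the
         * parent request; error handling elided.) */
        apr_bucket_brigade *safe = apr_brigade_create(f->r->main->pool);
        apr_bucket *bucket;

        APR_BRIGADE_FOREACH(bucket, b) {
            const char *str;
            apr_size_t n;

            if (APR_BUCKET_IS_EOS(bucket)) {
                break;  /* drop the EOS, as this filter already does */
            }
            apr_bucket_read(bucket, &str, &n, APR_BLOCK_READ);
            apr_brigade_write(safe, NULL, NULL, str, n);
        }
        apr_brigade_destroy(b);
        return ap_pass_brigade(f->next, safe);
    }
    return ap_pass_brigade(f->next, b);
}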