Date: Mon, 30 Oct 2006 12:44:46 +0000
From: Nick Kew
To: dev@httpd.apache.org
Subject: Re: svn commit: r468373 - in /httpd/httpd/trunk: CHANGES modules/cache/mod_cache.c modules/cache/mod_cache.h modules/cache/mod_disk_cache.c modules/cache/mod_disk_cache.h modules/cache/mod_mem_cache.c
Message-ID: <20061030124446.0e1fc997@grimnir>
In-Reply-To: <30068.196.8.104.27.1162209783.squirrel@www.sharp.fm>

On Mon, 30 Oct 2006 14:03:03 +0200 (SAST)
"Graham Leggett" wrote:

> On Mon, October 30, 2006 12:57 pm, Nick Kew wrote:
>
> >> The current expectation that it be possible to separate completely
> >> the storing of the cached response and the delivery of the content
> >> is broken.
> >
> > Why is that?
>
> Because:
>
> - the cache_body() hook is expected to swallow an entire brigade
> completely and write it to cache completely before this brigade is
> written to the network.
>
> In the case of files, that means one brigade, containing one bucket,
> containing one entire file. For a 4.7GB DVD ISO file, that means many
> minutes before the response starts arriving at the client, which has
> timed out at this point.

Hang on! Where's the file coming from? If it's local and static,
what is mod_cache supposed to gain you? And if not, what put it
in a (single) file bucket before it reached mod_cache?

> - apr_bucket_read() assumes that a bucket will only ever be read once.
>
> In so doing, it may morph buckets into heap buckets while reading,
> when buckets are too large to be read in one go. This behaviour is
> undocumented (I plan to fix that).

Yes. But what is reading them?

> If these heap buckets are not immediately deleted, they will last the
> lifetime of a request. They are not deleted in mod_disk_cache because
> later, we need to write these same buckets to the network. Out of
> memory ensues.

If mod_disk_cache gets a single file bucket as input, does it
actually need to read the file? It can send the file bucket down
the chain as-is, having given it a filesystem entry in cache space.

OK, that falls down if the cache's filespace is not on the same disc
as the file bucket.
But that in itself is a major overhead, and begs my first
question: what is mod_cache supposed to gain?

Mod_cache fronting a jukebox? Right, then you do want to copy the
file: can't the cache filter itself pass buckets as it reads them?
Of course it can. But just because this case exists doesn't mean
the cache filter should insist on reading every file bucket it gets!

OK, how about this for an alternative: introduce an apr_bucket_clone
method, that works by reference-counting and lazy copying, and in
the case of a file bucket, asynchronous copying. The filter can
clone the bucket, pass one copy on immediately, and save the other:
then the save will actually read the file if and only if it's copying
between filesystems, and the filter chain can use sendfile.

I haven't thought this through: I put it forward as the kind of
proposal that might fix the problem without breaking Justin's
expectations.

> Previous discussion is just noise, it would be better if I explain
> again. :-)

> > That suggests broken implementation and/or inappropriate usage.
> > It says nothing about expectation.
>
> Sorry, but when Google buys YouTube for a Googol dollars, the argument
> that nobody wants to serve large files makes no sense.

Nobody said that.

> The existing mod_cache, regardless of configuration, and regardless of
> cache disk size, can under no circumstances cache a file bigger than
> available RAM.
>
> This is well and truly broken.

Really? So if a DVD image comes in 8K chunks from mod_proxy,
mod_cache is going to buffer everything? Erm .... why?

Are you saying mod_cache enforces that? Or mod_disk_cache?
In the latter case, there's always the option of introducing
a new provider for large files.

> The patches were posted to this dev list a long time ago, and nobody
> took any time to review them. I see no reason why anybody is going to
> review patches going in on some parallel dev branch either.

OK, I plead guilty to not reviewing them. Did you motivate review
by accompanying them with an explanation (as above) of what
brokenness they fixed?

-- 
Nick Kew
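
For illustration only, here is a minimal C sketch of the reference-counting and lazy-copying idea behind the apr_bucket_clone suggestion above. Every name in it (toy_payload, toy_bucket, toy_bucket_clone, toy_bucket_make_private) is hypothetical: none of this is APR's bucket API, and it leaves out file buckets, asynchronous copying and sendfile entirely. It only shows that a clone can share the payload, and that a real copy need happen only when one holder asks for a private version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types for illustration; NOT the APR bucket API. */
typedef struct {
    char  *data;  /* payload shared between clones */
    size_t len;
    int    refs;  /* number of toy_buckets pointing at this payload */
} toy_payload;

typedef struct {
    toy_payload *p;
} toy_bucket;

/* Create a bucket owning a fresh copy of data. NULL on allocation failure. */
toy_bucket *toy_bucket_create(const char *data, size_t len)
{
    toy_bucket  *b = malloc(sizeof *b);
    toy_payload *p = malloc(sizeof *p);
    char *copy = malloc(len ? len : 1);
    if (!b || !p || !copy) {
        free(b); free(p); free(copy);
        return NULL;
    }
    if (len)
        memcpy(copy, data, len);
    p->data = copy;
    p->len  = len;
    p->refs = 1;
    b->p = p;
    return b;
}

/* Clone by reference counting: no payload bytes are copied here. */
toy_bucket *toy_bucket_clone(toy_bucket *b)
{
    toy_bucket *c = malloc(sizeof *c);
    if (!c)
        return NULL;
    assert(b->p->refs > 0);
    c->p = b->p;
    c->p->refs++;
    return c;
}

/* Lazy copy: give b its own payload only if it is currently shared.
 * Returns 0 on success, -1 on allocation failure. */
int toy_bucket_make_private(toy_bucket *b)
{
    toy_payload *old = b->p;
    if (old->refs == 1)
        return 0;                 /* already private, nothing to copy */
    toy_bucket *fresh = toy_bucket_create(old->data, old->len);
    if (!fresh)
        return -1;
    b->p = fresh->p;              /* adopt the new payload */
    free(fresh);                  /* discard only the temporary wrapper */
    old->refs--;
    return 0;
}

/* Drop one reference; free the payload when the last holder goes away. */
void toy_bucket_destroy(toy_bucket *b)
{
    if (--b->p->refs == 0) {
        free(b->p->data);
        free(b->p);
    }
    free(b);
}
```

In this toy model a filter would call toy_bucket_clone, hand one bucket downstream at once and keep the other for the cache. The bytes are duplicated only if someone later calls toy_bucket_make_private while the payload is still shared.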