From: Daniel Fagerstrom
Date: Sun, 11 Feb 2007 16:47:35 +0100
To: dev@cocoon.apache.org
Subject: Re: Improving ServletConnection to make it cache-aware
Message-ID: <45CF3A97.5000604@nada.kth.se>
In-Reply-To: <45CDFF98.70900@tuffmail.com>

Grzegorz Kossakowski wrote:
> Hello,
>
> I would like to discuss making ServletConnection cache-aware. Daniel
> suggested earlier to utilize standard HTTP protocol concepts. I
> totally agree with his opinion and would like to propose solution, but
> first let's discuss requirements.
>
> Requirements
> ============
>
> Requirements that ServletConnection must meet are really simple:
> 1. ServletConnection should provide data that can be used for
>    constructing small validation object.
> 2. ServletConnection should expose functionality for checking if
>    previous response is still valid taking as input validation object only.
> 3. We would like ServletConnection to make as few as possible round
>    trips in every situation it encounters.
>
> To satisfy these requirements I propose to use concept of HTTP
> conditional gets[1], more precisely If-Modified-Since
> request-header[2] field. This way we have following cases:
> * ServletConnection does not have information needed to create
>   If-Modified-Since header, but response includes Last-Modified header
>   and full content. Validity object can be created.
> * ServletConnection does not have information needed to create
>   If-Modified-Since header and response does not include Last-Modified
>   header but includes full content. Validity object cannot be created.
> * ServletConnection does have information needed to create
>   If-Modified-Since. Resource has not been modified so 304 status code
>   is returned as response and response does not include full content.
>   Thus ServletConnection can just tell that content is still valid and
>   can be fetched from cache.
> * ServletConnection does have information needed to create
>   If-Modified-Since. Resource has been modified so 200 status code is
>   returned as response and response includes full content.
>   ServletConnection tells that cached content is invalid and returns
>   fresh content.
>
> Requirements are satisfied:
> 1. Last-Modified header can be used to construct validation object.
> 2. Taking date from validation object enables ServletConnection to
>    formulate conditional GET and then response HTTP code settles if
>    resource is still valid.
> 3. In every case we have only one round trip.

I agree with everything so far. It would also be nice to add ETag
handling. The idea is that the servlet-service-fw should work with all
kinds of servlets. The Last-Modified and ETag headers are the two main
ways to handle caching in HTTP, so by supporting both we make caching
work for a larger share of servlets. But the first priority is of
course to make it work with the SitemapServlet.

Using ETags together with If-None-Match is analogous to using
Last-Modified together with If-Modified-Since, as you described above.
Some extra care is needed if the servlet called from the
ServletConnection returns both an ETag and a Last-Modified header.

> Implementation proposal
> =======================
>
> We should start from making pipelines more HTTP-compliant. This
> demands taking If-Modified-Since headers into account and returning
> appropriate status code when caching pipeline is processed. Behavior
> of non-caching pipelines should not change.

Agreed. There is some getLastModified info on the cachedResponse object
in the AbstractCachingProcessingPipeline. However, it doesn't seem to be
used for setting the Last-Modified header, or together with the
If-Modified-Since header. It might also be possible to use the
SourceValidity object (or maybe a hash key based on it) as an ETag.

> Then we should implement setIfModifiedSince and getIfModifiedSince
> from java.net.URLConnection and construct requests according to value
> of that property. Also getResponseCode method should be implemented.
>
> All changes proposed above will enable us to implement source
> validation of ServletSource very easily.
>
> Comments? Thoughts?

Seems like the right direction to me.

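To make the cases above concrete, here is a rough sketch of the client
side, written against plain java.net.HttpURLConnection rather than
ServletConnection. The names ConditionalGetSketch, Validity, Outcome,
classify and fetch are made up for illustration and are not part of the
servlet-service-fw. Sending both validators when both are known is an
assumption about how the "extra care" case could be handled, and error
statuses are not handled at all.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class ConditionalGetSketch {

    /** Hypothetical validation object: only the validators of the last response. */
    static final class Validity {
        final long lastModified; // milliseconds since epoch, 0 if the header was absent
        final String etag;       // null if the header was absent

        Validity(long lastModified, String etag) {
            this.lastModified = lastModified;
            this.etag = etag;
        }
    }

    enum Outcome { STILL_VALID, FRESH_CACHEABLE, FRESH_NOT_CACHEABLE }

    /** Maps a response to one of the cases listed above (illustrative only). */
    static Outcome classify(int responseCode, long lastModified, String etag) {
        if (responseCode == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return Outcome.STILL_VALID; // 304: no body, cached content can be reused
        }
        // Full content came back; it is cacheable only if a validator came with it.
        boolean hasValidator = lastModified > 0 || etag != null;
        return hasValidator ? Outcome.FRESH_CACHEABLE : Outcome.FRESH_NOT_CACHEABLE;
    }

    /** One round trip: a conditional GET if validators are known, a plain GET otherwise. */
    static Outcome fetch(URL url, Validity cached) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        if (cached != null && cached.lastModified > 0) {
            conn.setIfModifiedSince(cached.lastModified); // sends If-Modified-Since
        }
        if (cached != null && cached.etag != null) {
            conn.setRequestProperty("If-None-Match", cached.etag);
        }
        try {
            int code = conn.getResponseCode();
            // getLastModified() returns 0 and getHeaderField() returns null when absent.
            return classify(code, conn.getLastModified(), conn.getHeaderField("ETag"));
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would pass null as the cached Validity on the first request and
the stored validators on later ones. In both situations only a single
request is made, which is the third requirement.
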
> I can start implementing this as soon as we came with agreement on
> this. However, I would like to point out that I'll need some support
> to make changes in pipeline stuff. I've taken a look on code and not
> everything seems to be clear. Any volunteer on the board? ;-)

I can't say that the pipeline code is entirely clear to me either. Maybe
some of the original authors are still around?

> Last remark. I know that my English is quite poor and it could be that
> I do not express my thoughts clearly enough. I'm really working on it
> and you should not hesitate to ask when something is hard to understand.

Don't worry about that. I don't have any problem understanding what you
write. As soon as I had learned a little more about the HTTP protocol,
it was perfectly clear ;)

/Daniel