From: "Peter Hunsberger" <peter.hunsberger@gmail.com>
To: dev@cocoon.apache.org
Date: Fri, 16 Feb 2007 09:20:20 -0600
Subject: Re: Postable source and servlet services problem

On 2/16/07, Grzegorz Kossakowski wrote:

> Firstly, I do not understand how the PRG pattern relates to the problem
> of whether POST requests are cacheable or not.

In this pattern, the implementation that is truest to the original HTTP
standard's intention is for every POST request to be answered with a 303
status response. The browser in turn issues a GET. No data is ever
returned directly in response to the POST.
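
To make that concrete, here is a minimal servlet-level sketch of the idea
(the class and helper names are invented for illustration; in Cocoon you
would of course do this from a pipeline or flowscript rather than a raw
servlet):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Sketch only: every POST is answered with 303 See Other, the browser
    // follows up with a GET, and the GET response is the only thing that
    // is ever rendered (and potentially cached).
    public class SearchServlet extends HttpServlet {

        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // Process the submitted data, then redirect instead of rendering.
            String resultId = storeSearch(req.getParameter("query"));
            resp.setStatus(HttpServletResponse.SC_SEE_OTHER);   // 303
            resp.setHeader("Location", "result?id=" + resultId);
        }

        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // This is the response the user actually sees; it can be cached
            // exactly like any other GET.
            resp.setContentType("text/html");
            resp.getWriter().println(renderResult(req.getParameter("id")));
        }

        // Stand-ins for whatever storage/rendering the application does.
        private String storeSearch(String query) { return "42"; }
        private String renderResult(String id)   { return "<html>...</html>"; }
    }
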
From what I see you don't think there is a problem caching the response to
a GET?

> Secondly, in order to get a full understanding of your arguments I would
> like to ask you how this pipeline would be cached:
>
>
>
>
>
>
> Suppose that we have this http_post generator that parses (as XML) the
> body of the POST request. Of course this pipeline will work correctly
> only for POST requests, so suppose we have one. My question is: how
> could the cache key and cache validity object be created for this kind
> of generator? Could you please provide a quite detailed description, as
> I would like to understand this issue.

It's generated in exactly the same way as any other cache key. It depends
completely on the internal implementation of the generator and
transformer(s) and on what they consider to affect the cacheability of the
results they produce.

Consider first the case of a search form where no data is present in the
initial presentation of the form. The only requirement here is that the
form can be uniquely identified with respect to the cache key (for example
"patient.search", to take an example from our system). Now consider the
same form after it has been filled out by the user but has errors on
submission and has to be redisplayed. The final result is generally not
cached; the combination of the form and the data to be presented is
essentially a unique instance and there is no point in caching it.

On the Cocoon side you have a couple of ways to handle this. In our case,
the basic form (with no data) is generated in exactly the same way in both
cases, with the same cache key. However, we aggregate that result with
another generator that generates any data to be presented within the form.
When no data is present, a constant cache key is generated and a simple
SAX wrapper around what is essentially a null result is cached (which may
be used across many different form combinations). When data is present,
this particular generator always returns a null cache key and the data is
not cached (or the key points to a validity that will return false for the
validity check).

The results of the cached form and the sometimes-cached data now have an
aggregate cache key: in one case it is valid and everything in the
pipeline can be cached; in the other case the aggregate key is not valid
and the final results of the pipeline are not cached (even though partial
SAX streams inside the pipeline are). If a user POSTs an empty search
form, the pipeline might produce exactly the same results as the original
GET that first generated the form; it's not the GET or POST that
determines the cacheability, it's the data that was generated in response
to them. There are other use cases where form data can be cached, but I
hope this helps?

FWIW, we are starting to move away from a standardized HTTP POST response
pattern and to implement pure AJAX-based forms where the data exchanges
are based on XMLHttpRequest interchanges. This separates the generation of
the form from the data handling completely; however, the basics of caching
remain the same: if the pipeline that responds to the XMLHttpRequest
decides that the output can be cached, it generates a key that uniquely
identifies the response. The same sub-pipeline generates the same results
for a GET, a POST or an XMLHttpRequest under the covers; it doesn't care
how the request originated...
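
To sketch what I mean in code, assuming Cocoon's
CacheableProcessingComponent contract (getKey()/getValidity()), the data
generator might look roughly like this; the class itself, the "form-id"
parameter and the hasSubmittedData() check are made up for the example:

    import java.io.IOException;
    import java.io.Serializable;

    import org.apache.cocoon.ProcessingException;
    import org.apache.cocoon.caching.CacheableProcessingComponent;
    import org.apache.cocoon.environment.ObjectModelHelper;
    import org.apache.cocoon.environment.Request;
    import org.apache.cocoon.generation.AbstractGenerator;
    import org.apache.excalibur.source.SourceValidity;
    import org.apache.excalibur.source.impl.validity.NOPValidity;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.AttributesImpl;

    public class FormDataGenerator extends AbstractGenerator
            implements CacheableProcessingComponent {

        public Serializable getKey() {
            if (hasSubmittedData()) {
                // A filled-in form is a one-off combination: a null key tells
                // the pipeline not to cache this part, which in turn makes the
                // aggregate key for the whole response non-cacheable.
                return null;
            }
            // Empty form: a constant key (e.g. "patient.search") so the same
            // cached SAX fragment can be reused for every empty presentation.
            return parameters.getParameter("form-id", "patient.search");
        }

        public SourceValidity getValidity() {
            // The empty-form "null result" never changes, so it is always valid.
            return hasSubmittedData() ? null : new NOPValidity();
        }

        public void generate() throws IOException, SAXException, ProcessingException {
            contentHandler.startDocument();
            contentHandler.startElement("", "data", "data", new AttributesImpl());
            // ... emit either nothing or the submitted values as SAX events ...
            contentHandler.endElement("", "data", "data");
            contentHandler.endDocument();
        }

        private boolean hasSubmittedData() {
            Request request = ObjectModelHelper.getRequest(this.objectModel);
            return request.getParameter("query") != null;  // illustrative check
        }
    }

The form generator next to it would always return its constant key, and
the aggregation of the two keys becomes the cache key for the final
response, valid or not as described above.
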
> I do not like unclear situations so I will answer your doubts. You have
> said that I was suggesting that _validity objects_ can hang around while
> I was suggesting that _cache keys_ will hang around the request object.
> That's the reason for my "not really". Is it clear now?

Yes, makes sense, I was just reading too quickly...

> > You'd have to automatically add wrapper transformers that worked on
> > the sending and receiving pipelines (you can't require that they
> > inherit from some common abstract classes). The good thing is that
> > no new interfaces or components are required and the wrappers would be
> > rather trivial. The bad thing is that in some ways this is far more
> > hacky, since as I said, it's essentially magic (it's completely
> > hidden).
>
> Yes, and I agree there would be too much magic here. Also we would have
> no control over which components make use of the information stored in
> PIs. I think it's much better to introduce this new interface and have
> full knowledge of which components implement it.

I don't like any implementation that completely hides its workings.
However, having the key information passed about as metadata to the data
stream potentially allows for _any_ SAX data stream to become part of the
final results and still be cached. I can't give you a concrete use case,
but I'm guessing that this could be used for SOAP and other foreign data
stream encapsulation. Of course, if you really want to have that option,
then that means some kind of standardized metadata, and that's what
standard HTTP headers are all about. So maybe the "proper" implementation
here would be to use completely formed responses and parse the headers!
That's real work and no longer trivial, with no direct benefit for the
moment. Moreover, nothing you are doing would preclude such an
implementation in the future; I could see some form of standardized,
SOAP-like parser building a cache key for a foreign data stream that
would then be coupled into the pipeline implementation you are proposing,
if need be.

Phew, a lot of discussion, but I think it's important; as Cocoon separates
into discrete blocks we are essentially going to have to decide how
decoupled the blocks are. Caching often seems to be an afterthought in
distributed systems (which is what we will be building) and it's important
to understand the implications of the design decisions up front. If you
had presented your current proposal when you originally asked the
question I probably wouldn't have even responded, but would have continued
to have some nagging thoughts about this issue that I never expressed. So
forgive the rambling, but it helps me even if it doesn't help you...

--
Peter Hunsberger