cocoon-dev mailing list archives

From "Unico Hommes" <Un...@hippo.nl>
Subject RE: Improving HTTP protocol handling (Was: RE: Fooling around with cocoon davmap)
Date Tue, 04 Nov 2003 14:01:08 GMT
 

> 
> -----Original Message-----
> From: Sylvain Wallez [mailto:sylvain@apache.org] 
> Sent: maandag 3 november 2003 12:45
> To: dev@cocoon.apache.org
> 
> Unico Hommes wrote:
> 
> <snip/>
> 
> >>IMO, this should be handled at the pipeline level, i.e. on 
> a HEAD request, the pipeline should be built and setup, but 
> not executed. And this for several reasons:
> >>- not every request is handled by flowscript
> >>- some pipeline components set response headers, such as 
> the i18n transformer or the browser selector.
> >>- if we use the pipeline key as the Etag (see below), the 
> pipeline must be built and setup to compute that key.
> >>    
> >>
> >
> >Good point, we need to do that too, but not having to send a 
> page from the flow could also help us in other situations 
> where we don't need access to the pipeline. Think OPTIONS, 
> TRACE, MKCOL, PUT, etc. Or do you think these should also be 
> handled at the pipeline level?
> >  
> >
> 
> HEAD is a bit special here since it can be considered as a 
> "stripped-down" version of GET and as such doesn't require 
> special application-level handling.
> 
> Other methods need to trigger some application-specific 
> behaviour that must be handled somehow. But I see your point: 
> some methods don't ask for a response body. We currently have 
> no way to express this, as the sitemap engine throws a RNFE 
> (and hence a 404) if no pipeline was built.
> 
> To express this body-less response, several solutions come to mind:
> - have a "null-reader" that allows building a pipeline that 
> sends nothing
> - have some new method on environment stating that no body is 
> to be produced. But this requires a new sitemap statement.
> - redirect to a special protocol ("null-body:"?) that 
> indicates a body-less answer.
> 
> The first two solutions have the drawback of requiring some 
> matching in the pipeline just to say that we don't want to 
> generate a response body. 
> This is useless (and CPU consuming) if the request handling 
> is done in a flowscript.
> 
> The third solution (redirect) has the advantage of not adding 
> a new sitemap statement and of being available at no extra cost 
> from a flowscript (or an action). But it sounds a bit hacky.
> 
> What do you think?
> 

Why not just allow a flow function not to redirect? Wouldn't that be
easiest?

> >>Note that this pipeline-level handling is different from 
> fooling the serializer by sending its output to /dev/null, 
> since the processing chain is setup to get all required 
> information, but not executed.
> >>
> >>Actually, this is not very different from what happens 
> today when content is retrieved from the cache (pipeline is 
> built and setup but not executed).
> >>    
> >>
> >
> >OK. Are you saying then that the pipelines should be 
> handling more low level HTTP methods? Or do you see some 
> other specialized component handling this?
> >  
> >
> 
> Maybe just HEAD (see above).
> 
...
> 
> >>but it looks like the pipeline cache key could be used for 
> the ETag. 
> >>What do you think?
> >>
> >
> >I think so. The spec talks about weak and strong entity 
> tags. I would say the pipeline cache key qualifies as a weak 
> one. Weak keys only approximate semantic equivalence whereas 
> strong keys reflect the verbatim response.
> >
> 
> So strong keys can be e.g. the MD5 signature of the response body?
> 

... of the complete response IIUC. But I'm still trying to grasp this
myself so I am not really the one to ask.
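
To make the weak/strong distinction concrete, here is a minimal sketch in
plain Java. It is not Cocoon code: the class EtagSketch, its methods and
the sample cache key string are hypothetical and exist only for
illustration. The strong tag hashes the response body, as Sylvain
suggests (a stricter reading would also cover the entity headers); the
weak tag is derived from a string standing in for the pipeline cache key
and carries the "W/" prefix that HTTP uses for weak validators.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class EtagSketch {

    // Hex-encoded MD5 digest of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Strong validator: changes whenever a single byte of the body changes.
    // Assumption: only the body is hashed, not the headers.
    static String strongEtag(byte[] body) {
        return "\"" + md5Hex(body) + "\"";
    }

    // Weak validator: derived from a (hypothetical) pipeline cache key, so
    // it only claims that two responses are semantically equivalent.
    static String weakEtag(String pipelineKey) {
        return "W/\"" + md5Hex(pipelineKey.getBytes(StandardCharsets.UTF_8)) + "\"";
    }

    public static void main(String[] args) {
        byte[] body = "<html><body>hello</body></html>".getBytes(StandardCharsets.UTF_8);
        System.out.println("ETag: " + strongEtag(body));
        System.out.println("ETag: " + weakEtag("file:index.xml|xslt:page2html|html"));
    }
}
```

Two responses with the same cache key get the same weak tag even if
their bytes differ, which is exactly why the key cannot serve as a
strong tag.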

> >Because although the pipeline output may stay the same it 
> doesn't include information about the values of the response 
> headers, and because the validity object the pipeline gets from 
> the pipeline components doesn't state the content wouldn't be 
> different if it would execute the pipeline again, just that 
> it shouldn't execute the pipeline.
> >
> >
> 
> Mmmh... If this isn't true, then we have a serious problem, 
> because the pipeline is not executed if the validity is 
> valid. Or did I miss something?
> 

The thing I was getting at is that the components have full control over
what is cached and what isn't. Consider a component that knows it may be
somewhat heavy and therefore employs a delta time strategy that states
to the cache it will be valid until some time in the future. Now the
actual underlying datasource it uses could have changed but the cached
version would still be valid according to the pipeline component. The
execution of the pipeline however would still result in a different
response body if it were tried, which it is not of course. 
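
The scenario above can be sketched in a few lines of Java. This is a
simplified model, not Cocoon's caching API: DeltaValidity and
SourceStampValidity are hypothetical classes made up for the example.
The first one only looks at the clock, so it keeps reporting "valid"
after the data source has changed; the second one compares against the
source's last-modified stamp and so reports the "true" state.

```java
public class DeltaValiditySketch {

    // Hypothetical delta-time validity: valid until a fixed expiry time,
    // regardless of what happens to the underlying data source.
    static final class DeltaValidity {
        private final long expiresAt;

        DeltaValidity(long createdAt, long deltaMillis) {
            this.expiresAt = createdAt + deltaMillis;
        }

        boolean isValid(long now) {
            return now < expiresAt;
        }
    }

    // Hypothetical "true" validity: valid only while the source's
    // last-modified stamp is unchanged.
    static final class SourceStampValidity {
        private final long lastModified;

        SourceStampValidity(long lastModified) {
            this.lastModified = lastModified;
        }

        boolean isValid(long currentLastModified) {
            return currentLastModified == lastModified;
        }
    }

    public static void main(String[] args) {
        long cachedAt = 0L;
        long sourceStampWhenCached = 100L;
        DeltaValidity delta = new DeltaValidity(cachedAt, 60_000L);
        SourceStampValidity stamp = new SourceStampValidity(sourceStampWhenCached);

        // 30 seconds later the source has been modified (new stamp 200).
        long now = 30_000L;
        long sourceStampNow = 200L;

        // The delta validity still lets the cache serve the old body ...
        System.out.println("delta says valid: " + delta.isValid(now));
        // ... although re-executing the pipeline would give a new one.
        System.out.println("source stamp says valid: " + stamp.isValid(sourceStampNow));
    }
}
```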

I do wonder if this shows a potential point of improvement. Wouldn't it
be better if the contract between the caching pipeline and a cacheable
pipeline component were for the latter to return its "true" validity
(or to tell the pipeline it doesn't know)? That way, delta time
strategies and also event cache strategies would be configured at the
level of the pipeline instead of the individual components. This would
also get rid of the many different ways in which components can be
configured to control caching. This also gives rise to a finer
granularity in the definition of the pipeline section. But perhaps we
should leave this discussion for another occasion.

How this all relates to Etags I don't know.

> Also, the rule for pipeline components should be that 
> entity-header related headers (e.g. Vary of browser selector) 
> should be set at pipeline setup time while entity-body 
> related headers (e.g. content-length) should be set at 
> pipeline execution time.
> 

So this means that the cached version of a response is not guaranteed to
have all of the same headers set as the original pipeline execution?
Because as far as I remember, the cache only stores the response body,
not any headers.
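
A small Java model of the concern, under loudly stated assumptions:
HeaderPhasesSketch is a hypothetical stand-in for a caching pipeline,
its cache keeps only the body (as recalled above, which may not match
the real implementation), and the header values are invented. Setup
always runs and sets Vary; execution sets Content-Length and runs only
on a cache miss.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderPhasesSketch {

    static final class Response {
        final Map<String, String> headers = new LinkedHashMap<>();
        byte[] body = new byte[0];
    }

    // Assumption: the cache stores the response body only, no headers.
    private byte[] cachedBody;

    // Setup phase: runs on every request, sets entity-header related headers.
    void setup(Response r) {
        r.headers.put("Vary", "User-Agent");
    }

    // Execution phase: produces the body and sets entity-body related headers.
    void execute(Response r) {
        r.body = "<html/>".getBytes(StandardCharsets.US_ASCII);
        r.headers.put("Content-Length", String.valueOf(r.body.length));
        cachedBody = r.body;
    }

    // Set up the pipeline, then either serve from the cache or execute.
    Response handle() {
        Response r = new Response();
        setup(r);
        if (cachedBody != null) {
            r.body = cachedBody; // cache hit: execute() is skipped
        } else {
            execute(r);
        }
        return r;
    }

    public static void main(String[] args) {
        HeaderPhasesSketch pipeline = new HeaderPhasesSketch();
        System.out.println("miss: " + pipeline.handle().headers);
        System.out.println("hit:  " + pipeline.handle().headers);
    }
}
```

In this model the cache hit carries Vary but not Content-Length, so the
two responses differ in their headers unless the cache also records the
headers set at execution time.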

-- Unico

> Sylvain
> 
> -- 
> Sylvain Wallez                                  Anyware Technologies
> http://www.apache.org/~sylvain           http://www.anyware-tech.com
> { XML, Java, Cocoon, OpenSource }*{ Training, Consulting, 
> Projects } Orixo, the opensource XML business alliance  -  
> http://www.orixo.com
> 
> 
> 
> 
