httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dirk-Willem van Gulik <di...@webweaving.org>
Subject Re: cache - cleaning up mod_memcache and making other caches their live easier
Date Mon, 11 Feb 2008 00:22:27 GMT

On Feb 10, 2008, at 10:45 PM, Ruediger Pluem wrote:

> 1. Some style nitpicks (especially indenting, sometimes typos in  
> the comments) that
>    make the patch sometimes hard to read.

most files in cache seem out of sync with the .ident.pro file - so I  
guess I should do an
reformat run.

> 3. While it may be a worthwhile goal to improve the error handling  
> of mod_disk_cache
>    I would love to see this addressed in a separate patch as these  
> pieces of code distract
>    while reading the patch and are not strictly connected to your  
> goal of abstracting
>    logic out of mod_disk_cache.

Ok - will do separately - this patch was more thinking aloud than  
seriously - it has some holes still.

>> +/* Create a new table consisting of those elements from an input
>> + * headers table that are allowed to be stored in a cache.  
>> Optionally
>> + * filtered to just the fields passed in the vary_filter array.
>
> Why this? Do you want to save space by not storing the input  
> headers or
> only storing those relevant to varying?

No - what I'd like to do is 'minimize' what a storage needs to  
understand and do in order to be compliant. Specifically - I expect  
us to get more complex that just Vary at some point - and would  
really move this 'up' and out of the storage layer.

> From a first glance this looks like to be a good idea, but it may  
> be better
> separated to another patch to ease reading. This could possibly  
> used later
> to ease the logic in cache_select in cache_storage.c, as today its  
> vary logic
> is unneeded IMHO in the mod_disk_cache case. IMHO I cannot open the  
> mod_disk_cache

Agreed - that logic is un-needed for the disk cache.

But everyone does either need logic to filter theinput lines down to  
just those in 'Vary' --OR-- we need to build a two layer approach  
into the API down to the storage modules. E.g. by having open/create  
sometimes called twice - or by having mod_cache owning some of the  
key generation more strictly.

> entity if the cached resource had vary headers and the cached  
> headers do not
> match the ones from the incoming request. I would not have a hdrs  
> file in this
> case since the path to it is based on the hashed values of the  
> input headers
> that vary.
>>

>> +     *  4.20 Except [exerpt]
>> +     *     Comparison of expectation values is case-insensitive  
>> for unquoted
>> +     *     tokens (including the 100-continue token), and is case- 
>> sensitive for
>> +     *     quoted-string expectation-extensions.
>> +     *
>> +     *  XX we are currently concatenating the values. With some  
>> puzzling one could
>> +     *     set a header value which looks like the 'next/ 
>> previous' value - and hence
>> +     *     cause a value which is not the 'real' one. This may be  
>> a security risk.
>> +     *     Ideally we should use a special value which cannot  
>> occur in a header
>> +     *     legally (or is escaped/wacked). Given that (at least)  
>> a Header cannot
>> +     *     contain a space or ':' - may be an idea to insert those.
>> +     */
>
> I currently do not understand your worries here. Could you please  
> explain this
> in more detail?

Right now we simply concatenate values without any 'separator'. So by  
for example playing with the User-Agent - adding/prefixing another  
Vary value - you could perhaps fool us in thinking that another  
header was set - which was not set at all. I.e. with:

	Vary: Content-Language User-Agent

and a value on disk of

	EnMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4)  
Gecko/20070515 Firefox/2.0.0.4

then the question is did I pass

	GET / HTTP/1.0
	User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: 
1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
	Accept-Language; en
	Host : foo

or

	GET / HTTP/1.0
	User-Agent: EnMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: 
1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
	Foo

or something along those lines. Not sure how bad this is -- but I've  
been bitten by things like this in the past. What I worry about is  
that a clever user can get something out of the cache we did not expect.

Or am I way off here ?

Dw.

Mime
View raw message