httpd-dev mailing list archives

From Niklas Edmundsson <>
Subject Re: caching - serializing tables and what not
Date Fri, 25 Apr 2008 19:59:43 GMT
On Fri, 25 Apr 2008, Dirk-Willem van Gulik wrote:

> -	any religion on how we serialize tables throughout ?
> 	->	mod_disk_cache -- while there is some binary
> 		we are fairly careful to write most things
> 		out with 'key' ':' 'value' CR LF (and sort of
> 		hope the key never has a ':').
> 	->	mod_memcached and lots of others do
> 			'key' \0 'value' \0
> 			...
> 			'\0'

My mod_disk_cache jumbopatch changes the on-disk headers to 
key\0value\0 style with great success, i.e. it works fine with the 
lockless read-while-caching design I hacked up. The original (as in 
httpd proper) has lots of issues: it is unnecessarily inefficient 
both to store and to recall, and it carries no size information, so 
you can't easily decide whether the on-disk file is 
incomplete/corrupted.
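
To make the layout concrete, here is a minimal sketch of key\0value\0 
serialization with a length prefix, so a reader can tell a truncated 
header from a complete one. This is not the format used by the patch 
or by httpd: the struct kv type, the function names and the uint32_t 
length prefix are all assumptions made for illustration, and plain C 
types stand in for APR ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pair type; not an httpd or APR structure. */
struct kv { const char *key; const char *val; };

/* Writes a uint32_t payload length, then key\0value\0 for each pair,
 * then a closing \0. Returns total bytes written, or 0 if buf is too
 * small. The length is stored in host byte order (same-machine use). */
static size_t kv_serialize(const struct kv *kvs, size_t n,
                           unsigned char *buf, size_t cap)
{
    size_t need = 1;                       /* the closing \0 */
    size_t i;
    uint32_t len;
    unsigned char *p;

    for (i = 0; i < n; i++) {
        /* An empty key would be indistinguishable from the end marker. */
        assert(kvs[i].key[0] != '\0');
        need += strlen(kvs[i].key) + 1 + strlen(kvs[i].val) + 1;
    }
    if (need > UINT32_MAX || cap < sizeof len + need)
        return 0;

    len = (uint32_t)need;
    memcpy(buf, &len, sizeof len);
    p = buf + sizeof len;
    for (i = 0; i < n; i++) {
        size_t kl = strlen(kvs[i].key) + 1;
        size_t vl = strlen(kvs[i].val) + 1;
        memcpy(p, kvs[i].key, kl); p += kl;
        memcpy(p, kvs[i].val, vl); p += vl;
    }
    *p = '\0';
    return sizeof len + need;
}

/* Returns the number of pairs, or -1 if the data is truncated or
 * malformed (e.g. another process is still writing the file). */
static int kv_count(const unsigned char *buf, size_t avail)
{
    uint32_t len;
    const unsigned char *p, *end, *z;
    int pairs = 0;

    if (avail < sizeof len)
        return -1;
    memcpy(&len, buf, sizeof len);
    if (len == 0 || avail - sizeof len < len)
        return -1;                         /* header not fully on disk */

    p = buf + sizeof len;
    end = p + len;
    while (p < end && *p != '\0') {
        z = memchr(p, '\0', (size_t)(end - p));   /* end of key */
        if (z == NULL || z + 1 >= end)
            return -1;
        p = z + 1;
        z = memchr(p, '\0', (size_t)(end - p));   /* end of value */
        if (z == NULL)
            return -1;
        p = z + 1;
        pairs++;
    }
    /* The closing \0 must be the last byte of the payload. */
    return (p + 1 == end && *p == '\0') ? pairs : -1;
}
```

With a length up front, a reader that sees fewer bytes than promised 
can simply treat the entry as incomplete and retry or skip it, which 
the 'key' ':' 'value' CR LF layout cannot express on its own.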

For reference, my patch is currently being tracked at - it's a bit 
out of date, but since 2.2.8 is unusable due to the 
32bit-lfs-brigade-brokenness it can wait until 2.2.9. I haven't gotten 
around to creating a real in-tree fork of mod_disk_cache to accommodate 
it yet, but some if not all of the ideas in there should be usable for 
a wider audience than mostly-large-file archive sites. Anyhow, feel 
free to peek at it if you want to see code backing my ramblings ;)

> 	And the latter make it a superset of array serialization. I am tempted
> 	to go for the latter - and let it long term migrate into apr.
> -	it is useful to store things like timestamps, expiredates - we now
> 	do this 'raw' in most modules.
> 	->	wrap those in htonl/htons()
> 	->	serialize them integers to ascii.

This seems unnecessary to me. The on-disk info is only meant to be 
read by that specific machine IMHO. It should be tuned to be easy to 
load/store. Even though it's tempting to waste cycles on making it 
universally portable, I suspect it's a bad idea in the long run.

The only thing that I think can be of interest is to make it handle 
32/64bit transitions gracefully. I've achieved this in my 
mod_disk_cache jumbopatch by using the APR 32/64bit types explicitly 
instead of types that might vary in size.
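
As a sketch of what that means in practice, the header below uses only 
fixed-width fields, so its on-disk size does not change when the same 
cache directory is read by a 32-bit and a 64-bit build on the same 
machine. The struct and its field names are hypothetical and not taken 
from the patch; <stdint.h> types are used so the example compiles 
without APR, where the equivalents would be apr_uint32_t and 
apr_int64_t (apr_time_t is itself a 64-bit type).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header; field names are illustrative only.
 * No long, size_t, time_t or off_t here: their width can differ
 * between 32-bit and 64-bit builds. */
struct disk_header {
    uint32_t format;      /* layout version */
    uint32_t name_len;    /* length of the stored URL/key */
    int64_t  date;        /* timestamps kept as 64-bit values */
    int64_t  expire;
    int64_t  file_size;   /* 64-bit so files over 4 GB fit */
};

/* Compile-time check: the build fails if the layout is not the
 * expected 32 bytes on this platform. */
typedef char disk_header_size_check[
    (sizeof(struct disk_header) == 32) ? 1 : -1];

/* Raw store/load in host byte order, i.e. same-machine use only. */
static void header_store(unsigned char *buf, const struct disk_header *h)
{
    memcpy(buf, h, sizeof *h);
}

static void header_load(struct disk_header *h, const unsigned char *buf)
{
    memcpy(h, buf, sizeof *h);
}
```

Ordering the two 32-bit fields first keeps the 64-bit fields at offsets 
that are multiples of 8, so the compiler has no reason to insert 
padding that could differ between builds.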

> 	or is that generally felt as over the top ? I am tempted to do so - as the
> 	cost is very low - and it does help with distributed cases. But those who
> 	care the most about that are also the most likely to be careful about using
> 	the same endian/operating system throughout.

I'm not convinced this is useful, but then I'm rather biased due to 
our use case for mod_disk_cache ;)

/Nikke - at least managed to fill the 4 gigabits available to the
          computer club during the Ubuntu release using
          donated five-year-old leftover servers ;)
  Niklas Edmundsson, Admin @ {acc,hpc2n}      |
  "Read my lips and come to grips with reality." --Jafar
