httpd-users mailing list archives

From Ron <...@purplespots.com>
Subject [users@httpd] Suggestion: CDN using Etags
Date Mon, 12 Nov 2007 10:09:52 GMT
Hello,

I'm not sure if this is the right place to bring this up or not, but I
couldn't think of a better place to start.  If I'm writing to the wrong
place, please let me know.

I've been thinking about the problem of content distribution, especially
with linux distros, and I wonder if I might have hit upon a possible
interim solution.

As I understand it, web browsers use ETags to decide whether the copy
of some content they already hold is still current, and I presume that
caching proxies (squid) make use of the tag too.

Normally, as I understand it, the tag is built from the file's inode
number and other data chosen so that no other content on the web is
likely to have the same id.

Well, what if the Etag were something like:

ETag: "http://etag.somedistro.com/?n=httpd-2.2.6-3.i386.rpm&h=30092582700476e7c71768b0918f47b8"

As long as every http mirror of the data uses the same ETag, a cached
copy from _any_ mirror could be treated as the same content.
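
To make the idea concrete, here is a rough sketch of how a mirror might
derive such a tag from the file name and contents alone. The function
name is made up for illustration, and I'm assuming the h= value is an
MD5 hex digest (the post only shows 32 hex characters, it doesn't say
which hash). The host is the placeholder from the example above.

```python
import hashlib
from urllib.parse import urlencode

# Placeholder host from the example in this message; not a real service.
ETAG_BASE = "http://etag.somedistro.com/"


def shared_etag(filename: str, payload: bytes) -> str:
    """Build an ETag value that depends only on file name and contents.

    Hypothetical helper: any mirror holding the same bytes under the
    same name would produce the same tag, regardless of inode or mtime.
    """
    # Assumption: the h= parameter is an MD5 hex digest of the file.
    digest = hashlib.md5(payload).hexdigest()
    query = urlencode({"n": filename, "h": digest})
    # HTTP entity tags are sent as quoted strings.
    return f'"{ETAG_BASE}?{query}"'
```

Two mirrors calling shared_etag() on the same package would send the
same header, which is the property the rest of the idea relies on.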

For comparison, I have squid set up locally.  It caches a lot, but when
I download from random mirrors, the data doesn't come from the same
URL, so the cache considers each source to be different data, even if
the application (yum, apt) doesn't.
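
A toy model of the difference, just to illustrate the point. This is
not how squid works internally; the class, its fields, and the idea of
the cache knowing the tag before fetching the body (say, from a HEAD
request or a metadata lookup) are all assumptions for the sketch.

```python
class ToyCache:
    """Toy cache keyed by URL, with an optional secondary index by ETag."""

    def __init__(self, index_by_etag=False):
        self.by_url = {}           # url -> body
        self.by_etag = {}          # etag -> body
        self.index_by_etag = index_by_etag
        self.origin_fetches = 0    # how often we had to go to a mirror

    def get(self, url, etag, fetch):
        """Return the body for url, calling fetch(url) only on a miss."""
        if url in self.by_url:
            return self.by_url[url]
        if self.index_by_etag and etag in self.by_etag:
            # Same tag already seen via another mirror: reuse that copy.
            body = self.by_etag[etag]
        else:
            body = fetch(url)
            self.origin_fetches += 1
        self.by_url[url] = body
        self.by_etag[etag] = body
        return body
```

With index_by_etag left off, requesting the same package from two
different mirror URLs costs two fetches; with it on, the second request
is served from the copy already held.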

It would seem that it wouldn't take a lot of work on the server (apache)
to make some kind of pattern based etag system possible, if it isn't
already.

Then, the distro just needs to set up a server (e.g.
etag.somedistro.com), which doesn't serve the data, but metadata about
the data itself, and possibly mirror info too.

Then yum/apt might need some tweaks.

But it seems to me that you'd end up with an instant solution to flash
crowds, with the data helpfully cached by whatever cache sits between
the person downloading it and the rest of the world.

I'm quite certain that there are better solutions that can be thought
up, and I've seen a few, but this seems like it might be immediately
do-able, with relatively few tweaks.  It wouldn't solve every problem,
and many would remain, but I wonder, I just wonder if this might not
make a good dent.

I should note that I'm just a linux user.  I'm not a guru.  I've been
involved with linux/bsd for many years, but I'm definitely not an E.
Raymond, L. Torvalds, nor an rms.  Not even close. :)

Comments welcome.

-Ron
