httpd-dev mailing list archives

From Neil Gunton <>
Subject Re: mod_proxy distinguish cookies?
Date Wed, 05 May 2004 23:24:26 GMT

wrote:
> MOST Proxy Cache Servers ( including ones that SAY they are
> HTTP/1.1 compliant ) do NOT handle Vary: and they will simple
> treat ANY response they get with a "Vary:" header of any kind
> exactly the way MSIE seems to. They will treat it as if it was
> "Vary: *"  ( Vary: STAR ) and will REFUSE to cache it at all.

That's fine with me... I am mainly concerned with the browser and my
server. I know the browser will cache stuff when I want it to, and so
will my own reverse proxy. If intermediate caches choose not to then I
don't think it will have a huge effect on my server.

> I guess I really don't follow what the heck you are trying to do here.
> What do you mean by 'make every link on my site be of the form
> uri?xxxx'

Check out the site in question for an example of what I'm talking
about. The code on this site may change in the next couple of days, as
I move over to the new way of doing things (outlined in the previous
email), but it does currently have the "pics=xxx" on all URLs on the
site. I achieve this by having global Perl routines for writing all
links in all the pages. This is done in HTML::Embperl templates; every
page on the site is a template. This is the way that you can pass
options around the site without using cookies. The flaw, as I mentioned
previously, is that if someone posts a link somewhere, then that link
will inevitably have the poster's options embedded in the URL, so
anyone who clicks on that link will get their own options overwritten
by the ones in the link. This does work just fine currently, and has
for a while now in fact.
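
Just to illustrate what I mean by "global Perl routines for writing all
links", here is a rough sketch (not the actual site code; the routine
name and the %options hash are made up for the example):

    # Hypothetical sketch of a global link-writing helper. Every template
    # calls something like this instead of hard-coding <a href="...">, so
    # the user's current options ride along on every internal URL.
    use URI::Escape qw(uri_escape);

    sub write_link {
        my ($path, $label, %options) = @_;   # e.g. write_link('/doc/', 'Home', pics => 'off')
        my $query = join '&amp;',
            map { uri_escape($_) . '=' . uri_escape($options{$_}) }
            sort keys %options;
        my $href = $query ? "$path?$query" : $path;
        return qq{<a href="$href">$label</a>};
    }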

> I guess I just don't get what it is you are trying to do that falls
> outside the boundaries of normal CGI and 'standard practice'.

What I do currently falls well within normal CGI conventions and
'standard practice', afaik. I have also tested this with the major
browsers (at least IE and Mozilla) and it works just fine, with the
browser caching requests correctly according to the Cache-Control and
Expires headers, and also distinguishing requests based on the URL.
Perhaps this is just by coincidence and isn't the way the standards are
"supposed" to work, but then again I think it's probable that things in
the HTTP world are so entrenched at this point that if they changed the
way all this works, it would just break too many sites. So it'll
probably stay like this for the foreseeable future, if previous
experience of inertia is anything to go by...

> AFAIK 'shopping carts' had this all figured out years ago.
> Now... if what you meant was that every time you send a PAGE
> down to someone with a particular cookie ( Real Cookie:, not
> URI PARMS one ) and you re-write all the clickable 'href' links
> in THAT DOCUMENT to have the 'other URI cookie' then yea....
> I guess that will work. That should force any 'clicks' on that
> page to come back to you so that YOU can decide where
> they go or if that Cookie needs to change.
> But that would mean rewriting every page on the way out the door.
> Surely there must be an easier way to do whatever it is you
> are trying to do.

Using a template tool like HTML::Embperl, this is really not all that
big a deal. Every single page on my site is a template, some with HTML
and Perl code, some pure Perl modules. It may offend some purists, but
I've been developing this site for over three years now and it works
well for me.

> Officially... the fact that you will be using QUERY PARMS at
> all times SHOULD take you out of the 'caching' ball game
> altogether since the mere presence of QUERY PARMS in
> a URI is SUPPOSED to make it ineligible for caching at
> any point in the delivery chain.

Is this true, or is it just something that the early proxies did because
of assumptions about CGI scripts being always dynamic and therefore not
cacheable? I think I read that somewhere (or maybe it was a comment
about URLs with 'cgi-bin'), and anyway as I said earlier, these requests
seem to be cached correctly by mod_proxy, mod_accel and the browsers, as
long as the correct Expires and Cache-Control headers are present. I
seem to recall that Last-Modified had to be present as well before
mod_proxy would cache. But anyway, it does work.
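
For what it's worth, this is roughly the kind of header setup I mean,
written here as a bare mod_perl handler outline (a sketch only, not the
real code; build_page and the ten-minute lifetime are made up):

    # Sketch of sending the headers that let the browser (and my own
    # reverse proxy) cache a response even though the URL has query parms.
    use Apache::Constants qw(OK);
    use HTTP::Date qw(time2str);

    sub handler {
        my $r = shift;
        my $max_age = 600;                  # example lifetime: ten minutes
        my $html    = build_page($r);       # hypothetical page-building routine

        $r->content_type('text/html');
        $r->header_out('Cache-Control' => "max-age=$max_age");
        $r->header_out('Expires'       => time2str(time + $max_age));
        # mod_proxy seemed to want Last-Modified present before it would cache:
        $r->header_out('Last-Modified' => time2str(time));
        $r->send_http_header;
        $r->print($html);
        return OK;
    }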

> In other words... might as well use 'Cache-Control: no-cache'
> and just force everybody to come back all the time.

I don't think this is necessarily true, just from my own testing.

> Just because someone hits a 'Forward' or a 'Back' button on some
> GUI menu doesn't mean the HTTP freshness ( schemes ) always
> come into play. All you are asking the browser to do is jump
> between pages it has stored locally and that local cache is
> not actually required to be HTTP/1.1 compliant. Usually is NOT.
> Only the REFRESH button ( or CTRL-R ) can FORCE some browsers
> to 're-validate' a page. Simple local button navigations and
> re-displays
> from a local history list do not necessarily FORCE the browser to
> do anything at all 'out on the wire'.
> My own local Doppler Radar page is a good example.
> I can hit Forward or Backward until the cows come home and I
> will still see the same radar image. ONLY if I press CTRL-R
> or click Refresh ( manually ) do I have any chance of seeing
> a new radar screen.
> The browser is NOT doing a HEAD request or sending
> an 'If-Modified-Since' JUST because I pressed the 'Forward'
> or 'Back' button on the GUI menu.
> It all comes down to expiration dates and whether you
> have sent a 'pragma: auto-refresh' down with the original
> document.

Yes, I have had experience of setting the Expires and Cache-Control
headers for images as well as HTML documents, and I have found that it
is possible to force the browser to re-get the image when you click
'Back' on the browser. I had to do this early on in the development of
the crazyguyonabike website, because many of the pages were dynamic and
specific to whether the user was logged in or not (if they were logged
in then additional options were displayed). I think I had success in
requiring that the browser not cache images at all, when I needed that.
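
The no-cache case looked something like this (again only a sketch, with
the same caveats as above):

    # Headers that stop the browser from reusing a stored copy at all, so
    # even Back/Forward ends up re-fetching the page or image.
    use HTTP::Date qw(time2str);

    sub send_no_cache_headers {
        my $r = shift;                      # the Apache request object
        $r->header_out('Cache-Control' => 'no-cache, no-store, must-revalidate');
        $r->header_out('Pragma'        => 'no-cache');
        $r->header_out('Expires'       => time2str(time - 3600));  # already in the past
    }
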
> If you have rewritten every link on every page to use QUERY
> PARMS then I would assume this will 'force' a hit back to
> you if someone clicks on anything ( I don't know of any
> browser that would NOT obey this ) but this still doesn't
> cover what may/may not happen just because they hit
> the 'Forward' or 'Back' GUI menu button(s).

This is not my experience. Browsers cache URLs with query parms just
fine, if you have Expires and Cache-Control set correctly.

> Wouldn't a little JAVASCRIPT in the pages themselves help you out
> here?

I don't personally like JavaScript for behavior that is essential to the
correct working of my website. I use it for small convenience things,
such as automatically setting the keyboard focus on the login username
form field, but this is something that would work just as well without
the JavaScript. In other words, I use it for cases where if it works
then it adds something, but if it doesn't work then nothing breaks. It's
a personal choice, but as a developer and website user I prefer simple
HTML, not even any CSS on my site. People might think me simplistic, but
it's quite a conscious decision to go with simple HTML code that works
on ANY browser. Many of my users have written to me saying how much they
like the minimalist philosophy, since it loads so quickly and works on
their small palmtop browsers (which some people have used to update
their journals from the road). But that's a philosophical discussion
that would quickly go far out of the bounds of this mailing list, so
I'll leave it there! It's a personal choice, in summary; I'm not
saying JavaScript is inherently right or wrong.

Thanks again for the interesting discussion!

