httpd-dev mailing list archives

From TOKI...@aol.com
Subject Re: mod_proxy distinguish cookies?
Date Wed, 05 May 2004 18:48:01 GMT

> Neil wrote...
>
> Thanks again Kevin for the insight and interesting links. It seems to me
> that there are basically three components here: My server, intermediate
> caching proxies, and the end-user browser. From my understanding of the
> discussion so far, each of these can be covered as follows:
>
> 1. My server: Cookies can be understood (i.e. queries are
> differentiated) by my server's reverse proxy cache.

Sure... but only if you are receiving all the requests WHEN
and AS OFTEN as you need to. ( User-Agents coming back
for pages when they are supposed to )...

> 2. Intermediate caching proxies: I can use the 'Vary: Cookie' header to
> tell any intermediate caches that cookies differentiate requests.

Nope. Scratch the word 'any' and substitute 'some'.

There are very few 'Intermediate caching proxies' that are able to
'do the right thing' when it comes to 'Vary:'.

MOST Proxy Cache Servers ( including ones that SAY they are
HTTP/1.1 compliant ) do NOT handle Vary: and they will simply
treat ANY response they get with a "Vary:" header of any kind
exactly the way MSIE seems to. They will treat it as if it were
"Vary: *"  ( Vary: STAR ) and will REFUSE to cache it at all.

Might as well just use 'Cache-Control: no-cache'. It will be the
same behavior for caches that don't support "Vary:".
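The two behaviors described above can be sketched in a few lines. This is purely illustrative pseudologic, not any real proxy's code: a pre-Vary cache treats ANY "Vary:" header like "Vary: *" and refuses to store the response, while a Vary-aware cache builds a secondary key from the named request headers.

```python
def naive_cache_decision(response_headers):
    """Old-style proxies: any Vary header at all means 'do not cache'."""
    if "Vary" in response_headers:
        return False          # treated exactly like "Vary: *"
    if response_headers.get("Cache-Control") == "no-cache":
        return False
    return True

def vary_aware_cache_key(url, response_headers, request_headers):
    """An HTTP/1.1-style cache keys each entry on the varied header values."""
    vary = response_headers.get("Vary", "")
    if vary.strip() == "*":
        return None           # genuinely uncacheable
    varied = tuple(
        (name.strip(), request_headers.get(name.strip(), ""))
        for name in vary.split(",") if name.strip()
    )
    return (url, varied)

# The naive proxy drops a "Vary: Cookie" response entirely...
print(naive_cache_decision({"Vary": "Cookie"}))                    # False
# ...while the Vary-aware one can cache it once per cookie value.
print(vary_aware_cache_key("/page", {"Vary": "Cookie"},
                           {"Cookie": "opts=123"}))
```

So for the naive cache, "Vary: Cookie" and "Cache-Control: no-cache" really do end in the same place: nothing gets stored.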

SQUID is the ONLY caching proxy I know of that even comes
close to handling "Vary:" correctly but only the latest version(s).

For years now... even SQUID would just 'punt' any response
that had any kind of "Vary:" header at all. It would default
all "Vary: xxxxxx" headers to "Vary: *"  ( Vary: STAR ) and
never bother to cache them at all.

Even the latest version
of SQUID is still not HTTP/1.1 compliant. There are still a lot
of 'ETag:' things that don't get handled correctly.

It's possible to implement "Vary:" without doing full "ETag:"
support as well, but there will always be times when the
response is not cacheable unless full "ETag:" support
is on board.
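Here is roughly why ETag support matters alongside Vary: a cache holding several variants of one URL can revalidate them all in one round trip by sending every stored ETag in If-None-Match, and the origin answers 304 plus the ETag of the variant that is still good. The names and values below are made up for illustration:

```python
# Hypothetical cache state: ETag -> the Cookie value each variant was stored under.
stored_variants = {
    '"v1"': "opts=123",
    '"v2"': "opts=456",
}

def build_revalidation_header(variants):
    # One conditional request covers every stored variant at once.
    return "If-None-Match: " + ", ".join(variants)

def origin_revalidate(if_none_match_etags, current_etag):
    # A 304 tells the cache WHICH stored variant it may keep serving;
    # otherwise a full 200 response carries a new variant to store.
    if current_etag in if_none_match_etags:
        return (304, current_etag)
    return (200, current_etag)

print(build_revalidation_header(stored_variants))
print(origin_revalidate(['"v1"', '"v2"'], '"v2"'))   # (304, '"v2"')
```

A cache without that machinery has no safe way to revalidate multiple variants, which is why it ends up just not caching varied responses at all.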

So you CAN/SHOULD use the "Vary: Cookie" response
header and it WILL work for SOME inline caches... but
be fully prepared for users to report problems when the
inline cache is paying no attention to your "Vary:".

> 3. Browsers: Pass the option cookie around as part of the URL param list
> (relatively easy to do using HTML::Embperl or other template solution).
> So if the cookie is "opts=123", then I make every link on my site be of
> the form "/somedir/example.html?opts=123&...". This makes the page look
> different to the browser when the cookie is changed, so the browser will
> have to get the new version of the page. 

Not sure. Maybe.

I guess I really don't follow what the heck you are trying to do here.

What do you mean by 'make every link on my site be of the form uri?xxxx'

Don't you mean you want everyone USING your site to be sending
these various 'cookie' deals so you can tell who is who, and something
just steps in and makes sure they get the right response?

You should not have to 'make every link on my site' be anything.
Something else should be sorting all the requests out.

I guess I just don't get what it is you are trying to do that falls
outside the boundaries of normal CGI and 'standard practice'.

AFAIK 'shopping carts' had this all figured out years ago.

Now... if what you meant was that every time you send a PAGE
down to someone with a particular cookie ( a real Cookie: header,
not the URI PARMS one ) you re-write all the clickable 'href' links
in THAT DOCUMENT to carry the 'other URI cookie'... then yea...
I guess that will work. That should force any 'clicks' on that
page to come back to you so that YOU can decide where
they go or whether that Cookie needs to change.

But that would mean rewriting every page on the way out the door.
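That rewrite-on-the-way-out step could be as simple as the sketch below. The `opts` name is Neil's own example; the function and regex are a hypothetical minimal version, not production-grade HTML handling:

```python
import re

def rewrite_links(html, opts_value):
    """Append the user's cookie value to every href as a query parameter."""
    def add_param(match):
        url = match.group(1)
        sep = "&" if "?" in url else "?"       # respect existing query strings
        return 'href="%s%sopts=%s"' % (url, sep, opts_value)
    return re.sub(r'href="([^"]+)"', add_param, html)

page = '<a href="/somedir/example.html">x</a> <a href="/cgi?a=1">y</a>'
print(rewrite_links(page, "123"))
```

A template system like the HTML::Embperl setup Neil mentions would normally do this at page-generation time instead of regex-filtering finished HTML, but the effect is the same: every outgoing link carries the visitor's own opts value.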

Surely there must be an easier way to do whatever it is you
are trying to do.

Officially... the fact that you will be using QUERY PARMS at
all times SHOULD take you out of the 'caching' ball game
altogether since the mere presence of QUERY PARMS in
a URI is SUPPOSED to make it ineligible for caching at
any point in the delivery chain.

In other words... might as well use 'Cache-Control: no-cache'
and just force everybody to come back all the time.

> ...This makes the page look
> different to the browser when the cookie is changed, so the browser will
> have to get the new version of the page. 

Again.. I am not sure I would say 'have to'.

There is no 'have to' when it comes to what a User-Agent may or
may not be doing with cached files. Most of them follow the rules
but many do not.

I think you might be a little confused about what is actually going
on down at the browser level.

Just because someone hits a 'Forward' or a 'Back' button on some
GUI menu doesn't mean the HTTP freshness schemes always
come into play. All you are asking the browser to do is jump
between pages it has stored locally, and that local cache is
not actually required to be HTTP/1.1 compliant. It usually is NOT.

Only the REFRESH button ( or CTRL-R ) can FORCE some browsers
to 're-validate' a page. Simple local button navigations and re-displays
from a local history list do not necessarily FORCE the browser to
do anything at all 'out on the wire'.

My own local Doppler Radar page is a good example.

I can hit Forward or Backward until the cows come home and I
will still see the same radar image. ONLY if I press CTRL-R 
or click Refresh ( manually ) do I have any chance of seeing
a new radar screen.

The browser is NOT doing a HEAD request or sending 
an 'If-Modified-Since' JUST because I pressed the 'Forward'
or 'Back' button on the GUI menu.

It all comes down to expiration dates ( 'Expires:' and
'Cache-Control: max-age' ) and whether you have sent something
like a 'Refresh' directive down with the original document.

If you have rewritten every link on every page to use QUERY
PARMS then I would assume this will 'force' a hit back to
you if someone clicks on anything ( I don't know of any
browser that would NOT obey this ) but this still doesn't
cover what may/may not happen just because they hit
the 'Forward' or 'Back' GUI menu button(s).

> I don't actually use the URL
> param on the backend, only the cookie version of the value is used. The
> URL param is simply there to make the URL look different to the browser.
> Thus if someone posts a link to my website with opt=123 in the query
> string, and then someone with cookie opt=456 clicks on that link, they
> should successfully get the right version of the page.

I hear ya... but again... I think you would be just as well off 
using "Cache-Control: no-cache" and/or "Expires: -1" for all
the good the 'follow the bouncing URI parms' scheme is going
to do for you. The reality of either approach is that your 
pages just aren't going to be cached very much, if at all.
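The backend rule Neil describes ( query-string copy ignored, real cookie decides ) boils down to something like this. Names are illustrative, not from any actual handler:

```python
def pick_variant(request_uri, cookie_header):
    """Serve the variant named by the Cookie header, never the URI."""
    # The ?opts=... in request_uri is deliberately never consulted; it only
    # exists to make the URL look different to browsers and caches.
    cookies = dict(
        pair.split("=", 1) for pair in cookie_header.split("; ") if "=" in pair
    )
    return cookies.get("opts", "default")

# A stale posted link carrying opts=123 still serves this visitor's own 456 version.
print(pick_variant("/somedir/example.html?opts=123", "opts=456"))   # 456
```

Which is exactly why the scheme survives people sharing links: the URI parm is only a cache-buster, the cookie is the truth.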

> I think all this allows me to have pages be cached, 
> while also allowing
> cookies to be used to store options. This does assume that any "real"
> proxy caches in the middle obey the "Vary: Cookie" header. If they get a
> request for a page in their cache from a browser with a different cookie
> to that associated with the cache entry, then presumably the cache is
> required to not use the cache entry and pass it through to the origin
> server.

Presumption correct... but only if that puppy is even trying to do "Vary:".

If it's not SQUID ( latest version only ) good luck to you.

> This obviously isn't ideal, but it attempts to address the world as it
> seems to be today.

Should work. Your URI parms 'trick' should take care of (most)
refresh brain-death at the browser level and adding "Vary: Cookie"
will at least allow SOME proxies to cache things so your Content
Origin Server doesn't get beat to death.

> If anyone sees any potential problems with this sort of setup, then let
> me know...
>
> Thanks again, this has been a very enlightening discussion.
>
> -Neil

Wouldn't a little JAVASCRIPT in the pages themselves help you out here?

The pages themselves can 'know' if 'Forward' or 'Back' buttons are
getting pressed and all that do-dah and could be making client-level
decisions for you regarding link jumps and 'what page to get'.

Yours...
Kevin




Original message...

>>TOKILEY@aol.com wrote:
>>
>> Bottom line:
>> 
>> In order to do your 'Cookie' scheme and have it work with
>> all major browsers you might have to give up on the idea
>> that the responses can EVER be 'cached' locally by
>> a browser... but now you also lose the ability to have
>> it cached by ANYONE.
>> 
>> There is no HTTP caching control directive that says...
>> 
>> Cache-Control: no-cache-only-if-endpoint-user-agent
>> 
>> Given the caching issues in most 'end-point' browsers...
>> There probably should be such a directive.
>> 
>> The ONLY guy you don't want to cache it is the
>> end-point browser itself... but you DO want the
>> response available from other nearby caches so
>> your Content Origin Server doesn't get hammered
>> to death.
