httpd-dev mailing list archives

From toki...@aol.com
Subject Re: mod_cache, mod_deflate and Vary: User-Agent
Date Thu, 27 Aug 2009 07:50:43 GMT

> William A. Rowe, Jr.
>
> I think we blew it :)
>
> Vary: user-agent is not practical for correcting errant browser behavior.

You have not 'blown it'.

From a certain perspective, it's the only reasonable thing to do.

Everyone keeps forgetting one very important aspect of this issue
and that is the fact that the 'Browsers' themselves are 
participating in the whole 'caching' scheme and that they
are the source of the actual requests, so their behavior is
as much a part of the equation as any inline proxy cache.

There is no real solution to this problem.

The HTTP protocol itself has no way to deal correctly
with compressed variants.

The only decision that anyone needs to make is 'Where is
the pain factor?'.

If you VARY on ANYTHING other than 'User-Agent' then this
might show some reduction of the pain factor at the proxy
level but you have now exponentially increased the pain
factor at the infamous 'Last Mile'.

Most modern browsers will NOT 'cache' any response whose
'Vary:' header names a field OTHER than 'User-Agent'. This is
as true today as it was 10 years ago.

The following discussion involving myself and some of the 
authors of the SQUID Proxy caching Server took place just 
short of SEVEN (7) YEARS ago but, as unbelievable as it might
seem, is still just as relevant ( and unresolved )...

http://marc.info/?l=apache-modgzip&m=103958533520502&w=2

It's way too long to reproduce here but here is just 
the SUMMARY part. You would have to access the link
above to read all the gory details...

[snip]

> Hello all.
>
> This is a continuation of the thread entitled...
>
> [Mod_gzip] "mod_gzip_send_vary=Yes" disables caching on IE
>
> After several hours spent doing my own testing with MSIE and
> digging into MSIE internals with a kernel debugger I think I
> have the answers.
>
> The news is NOT GOOD.
>
> I will start with a SUMMARY first for those who don't have the
> time to read the whole, ugly story but for those who want to
> know where the following 'conclusions' are coming from I
> refer you to the rest of the message and the "detail".
>
> SUMMARY
>
> There is only 1 request header value that you can use with
> "Vary:" that will cause MSIE to cache a non-compressed
> response and that is ( drum roll please ) "User-Agent".
>
> If you use ANY other (legal) request header field name in
> a "Vary:" header then MSIE ( Versions 4, 5 and 6 ) will
> REFUSE to cache that response in the MSIE local cache.
>
> This is why Jordan is seeing a caching problem and Slava
> is not. Slava is 'accidentally' using the only possible "Vary:"
> field name that will cause MSIE to behave as it should
> and cache a non-compressed response.
>
> Jordan is seeing non-compressed responses never being
> cached by MSIE because the responses are arriving
> with something other than "Vary: User-Agent" like
> "Vary: Accept-Encoding".
>
> It should be perfectly legal and fine to send "Vary: Accept-Encoding"
> on a non-compressed response that can 'Vary' on that field
> value and that response SHOULD be 'cached' by MSIE...
> but so much for assumptions. MSIE will NOT cache this response.
>
> MSIE will treat ANY field name other than "User-Agent"
> as if "Vary: *" ( Vary + STAR ) was used and it will
> NOT cache the non-compressed response.
>
> The reason the COMPRESSED responses are, in fact,
> always getting cached no matter what "Vary:" field name
> is present is just as I suspected... it is because MSIE
> decides it MUST cache responses that arrive with
> "Content-Encoding: gzip" because it MUST have a
> disk ( cache ) file to work with in order to do the
> decompression.
>
> The problem exists in ALL versions of MSIE but it's
> even WORSE for any version earlier than 5.0. MSIE 4.x
> will not even cache responses with "Vary: User-Agent".
>
> That's it for the SUMMARY.
>
> The rest of this message contains the gory details.

[/snip]

I participated in another lengthy 'offline' discussion about
all this some 3 or 4 years ago, again with the authors of
SQUID. There was still no real resolution to the problem.

The general consensus was that if there is always going to
be a 'pain factor' then it's better to follow one of the
rules of Networking and assume the following...

"The least amount of resources will always be present
the closer you get to the last mile."

In other words... it's BETTER to live with some redundant
traffic at the proxy level, where the equipment and bandwidth 
is usually more robust and closer to the backbone, than to put 
the pain factor onto the 'last mile' where resources are usually
more constrained.

If anyone is going to start dropping some special code
anywhere to 'invisibly handle the problem' my suggestion
would be to look at coming up with a scheme that undoes
the damage these out-of-control redundant 'User-Agent' strings are 
causing. The only thing a proxy cache really needs to know is
whether a certain 'User-Agent' string represents a 
different level of DEVCAP than another one. If all that
is changing is a version number and there is no change
with regards to actual Device Capabilities then there's
no reason to cache a separate response for that User Agent.
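
As a rough illustration of that idea (a sketch only, not code from
Apache or SQUID), a cache could collapse raw 'User-Agent' strings into
a few capability classes and key on the class instead of the string.
The function names, the class labels and the matching rules below are
all hypothetical; a real table of device capabilities would have to
come from somewhere else.

```python
import re


def capability_class(user_agent: str) -> str:
    """Map a raw User-Agent string to a coarse, made-up capability label."""
    ua = user_agent or ""
    # MSIE 4.x gets its own class only because the summary quoted above
    # reports that it caches differently from 5.x and 6.x.
    match = re.search(r"MSIE (\d+)\.", ua)
    if match:
        return "msie-legacy" if int(match.group(1)) < 5 else "msie"
    if "Gecko/" in ua:
        return "gecko"
    return "unknown"


def cache_key(url: str, user_agent: str) -> tuple:
    """Key on the capability class, so a version bump alone adds no variant."""
    return (url, capability_class(user_agent))
```

With this, two Firefox builds that differ only in version number share
one cache entry, while MSIE 4.x and MSIE 6 still get separate ones.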

That still wouldn't represent the ultimate 'fix' for this
multi-variant caching issue... but it sure would be a
step in the right direction.

Yours...
Kevin Kiley

BTW: This posting doesn't even come anywhere near the
real issue which is that even Browsers that 'appear'
to not be able to support 'Accept-Encoding: gzip, deflate'
usually CAN... but it's actually all about MIME TYPES.
The HTTP protocol does NOT provide a way for a client to 
indicate WHICH mime types it can or cannot 'decompress'.
Browsers that appear 'broken' with regards to decompression
are actually only 'broken' for certain MIME types.

That's a completely separate discussion and I'm not
going to 'go there' tonight.


-----Original Message-----
From: William A. Rowe, Jr. <wrowe@rowe-clan.net>
To: dev@httpd.apache.org <dev@httpd.apache.org>
Sent: Wed, Aug 26, 2009 1:47 pm
Subject: mod_cache, mod_deflate and Vary: User-Agent

I think we blew it :)

Vary: user-agent is not practical for correcting errant browser behavior.

For example:

  User-Agent: Mozilla/5.0 Gecko/20090729 Firefox/3.5.2

is just one of the myriad 'variant' flavors produced when Vary is tagged
with User-Agent to determine whether the deflate/gzip-compressed variant
or the uncompressed variant should be served.
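
To make the fragmentation concrete, here is a simplified sketch of how a
cache derives a per-variant key from the fields named in a Vary header.
The function name is hypothetical and the logic ignores the header-value
normalization a real cache may apply; it is not how mod_cache is
implemented.

```python
def variant_key(vary: str, request_headers: dict) -> tuple:
    """Build a secondary cache key from the request fields named in Vary."""
    # Field names are case-insensitive, so compare them in lower case.
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    lowered = {key.lower(): value for key, value in request_headers.items()}
    # A missing request header is treated as an empty value.
    return tuple((name, lowered.get(name, "")) for name in sorted(names))
```

Two requests that differ only in the Firefox build date get different
keys under "Vary: User-Agent" but the same key under
"Vary: Accept-Encoding".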

What we really meant to do was to determine which Accept-Encoding values
were invalid based on known browser bugs, and -remove them- from the A-E
header *prior* to determining the cache handling (quick handler hook) or
typical content handling.

Which implies that setenvif + headers need an extra chance to run
first, ahead of the quick handler.
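
A minimal sketch of that ordering, written in Python rather than as an
httpd hook: strip the unusable Accept-Encoding tokens first, then let the
cache look only at the cleaned header. The rule table, the
"ExampleBrowser" pattern and the function name are placeholders made up
for illustration; they are not a real list of browser bugs or an existing
httpd API.

```python
import re

# Placeholder rules: (User-Agent pattern, codings to drop for that browser).
BROKEN_ENCODINGS = [
    (re.compile(r"ExampleBrowser/1\."), {"gzip", "deflate"}),
]


def sanitize_accept_encoding(headers: dict) -> dict:
    """Return a copy of headers with unusable Accept-Encoding tokens removed.

    Assumes the dict uses the exact key spellings "User-Agent" and
    "Accept-Encoding".
    """
    user_agent = headers.get("User-Agent", "")
    drop = set()
    for pattern, codings in BROKEN_ENCODINGS:
        if pattern.search(user_agent):
            drop |= codings

    tokens = [t.strip() for t in headers.get("Accept-Encoding", "").split(",")]
    # Compare on the coding name only, ignoring any ";q=" parameter.
    kept = [t for t in tokens
            if t and t.split(";")[0].strip().lower() not in drop]

    cleaned = dict(headers)
    if kept:
        cleaned["Accept-Encoding"] = ", ".join(kept)
    else:
        cleaned.pop("Accept-Encoding", None)
    return cleaned
```

If this runs before the cache lookup, the response can carry
"Vary: Accept-Encoding" alone, and the User-Agent never has to become
part of the cache key.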

Any better suggestions?
