From: TOKILEY@aol.com
Date: Wed, 4 Aug 2004 17:20:13 EDT
Subject: Re: [PATCH] mod_disk cached fixed
To: dev@httpd.apache.org

> Brian Akins wrote...
>
> Serving cached content:
>
> - lookup uri in cache (via md5?).
> - check varies - a list of headers to vary on
> - calculate new key (md5) based on uri and client's value of these headers
> - lookup new uri in cache
> - continue as normal
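
For reference, the lookup steps quoted above could be sketched roughly as follows. This is only an illustration, not mod_disk_cache code: the dict-based cache, the `store` and `lookup` helpers, and the lowercase header-name convention are all assumptions. It hashes the raw header values, which is exactly the limitation discussed in the reply below.

```python
import hashlib


def md5_key(*parts):
    """Hash the given strings into one hex cache key."""
    h = hashlib.md5()
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def store(cache, uri, vary, request_headers, body):
    """Store a response; `vary` lists the request headers it varies on."""
    # The primary entry (keyed on the URI alone) records the Vary list.
    cache[md5_key(uri)] = {"vary": list(vary), "body": None if vary else body}
    if vary:
        # The variant entry is keyed on the URI plus the client's raw values.
        values = [request_headers.get(name.lower(), "") for name in vary]
        cache[md5_key(uri, *values)] = {"vary": [], "body": body}


def lookup(cache, uri, request_headers):
    """Two-stage lookup: by URI first, then by URI plus varied header values."""
    primary = cache.get(md5_key(uri))
    if primary is None:
        return None
    if not primary["vary"]:
        return primary["body"]
    values = [request_headers.get(name.lower(), "") for name in primary["vary"]]
    variant = cache.get(md5_key(uri, *values))
    return variant["body"] if variant else None
```

With this sketch, a request sending "gzip, deflate" misses an entry stored for "gzip" because the raw strings hash differently, as intended. A request sending "deflate,gzip" would also miss an entry stored for "gzip, deflate", although the two values mean the same thing.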

Don't forget that you can't just 'MD5' a header from one response and
compare it to an 'MD5' value for same header field from another response.
A "Vary:" check does not mean 'has to be exactly the same as the other one'.

For a header to 'Vary', it has to be 'semantically' different.

You can have a header value that is formatted differently from another
and it is still, essentially, the SAME as another and does NOT VARY.

That includes different amounts of whitespace and a different
'ordering' of the 'values'. As long as the 'values' are the SAME
with regards to another header, then the header fields do
NOT VARY.

The only way to do it right is to be able to parse each and every
header (correctly) according to BNF and compare them
that way. Syntax or whitespace differences don't automatically
mean a header 'Varies' at all.
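
A very rough sketch of the idea, far short of real BNF parsing: normalize a comma-separated header value before comparing it, so that whitespace and ordering differences drop out. The helper names are hypothetical. The sketch assumes the header is a simple comma-separated list whose order and case do not matter, which is not true of every header. Parameters such as q-values and quoted strings containing commas are not handled.

```python
def normalize_list_header(value):
    """Reduce a comma-separated header value to an order-independent form."""
    tokens = []
    for item in value.split(","):
        # Collapse runs of whitespace and fold case (an assumption: not
        # every header field is case-insensitive).
        token = " ".join(item.split()).lower()
        if token:
            tokens.append(token)
    return tuple(sorted(tokens))


def same_for_vary(a, b):
    """True if two header values hold the same set of list items."""
    return normalize_list_header(a) == normalize_list_header(b)
```

Here "gzip, deflate" and "deflate,gzip" compare as the same, so they would not count as varying. A cache key built from the normalized form, rather than from the raw string, would then map both requests to the same variant.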

> The thing that sucks is if you vary on User-Agent.  You wind up with a
> ton of entries per uri.

Yep. That's how 'Multi-Variants' works. There might be very good
reasons why every 'Varying' User-Agent needs a different 'Variant'
of the same response.

> I cheated in another module by "varying" on an
> environmental variable.  Kind of like this:
>
> BrowserMatch ".*MSIE [1-3]|MSIE [1-5].*Mac.*|^Mozilla/[1-4].*Nav" no-gzip
>
> and just "vary" on no-gzip (1 or 0), but this may be hard to do just
> using headers...

It's not hard to do at all... the question would be whether it's ever
the 'right' thing to do.

The actual compressed content for different 'User-Agents' might
actually 'Vary:' as well so no one single compressed version of
a response should be used to satisfy all non-no-gzip requests
if there is actually a 'Vary: User-Agent' rule involved.
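
To make the no-gzip trick quoted above concrete, here is a minimal sketch. It reuses the regular expression from the BrowserMatch line and collapses every User-Agent into a 1-or-0 flag that becomes part of the cache key. The function names and the tuple-shaped key are assumptions for illustration only. As noted above, this is only safe when the response really differs by nothing else in the User-Agent.

```python
import re

# Same pattern as the BrowserMatch directive quoted above. BrowserMatch is
# case-sensitive and unanchored, so re.search without flags is the analogue.
NO_GZIP = re.compile(r".*MSIE [1-3]|MSIE [1-5].*Mac.*|^Mozilla/[1-4].*Nav")


def no_gzip_flag(user_agent):
    """Return "1" if the client matches the no-gzip pattern, else "0"."""
    return "1" if NO_GZIP.search(user_agent) else "0"


def variant_key(uri, user_agent):
    """Key a cached variant on the derived flag, not the raw User-Agent."""
    return (uri, no_gzip_flag(user_agent))
```

Two different modern browsers end up with the same key, so there are at most two entries per uri instead of one per distinct User-Agent string.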

It's pretty hard to 'cheat' on 'Vary:'.

That's why it remains one of the least-supported features of HTTP.

It's kind of an 'all or nothing' deal whereby if you can't do it ALL
correctly... then you might as well do what most products do and
treat ANY 'Vary:' header as if it were 'Vary: *' ( Vary: STAR )
and not even bother trying to cache it.
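
That conservative fallback is about as simple as a policy gets. A minimal sketch, with a hypothetical function name and a plain dict standing in for the response headers (a real cache would check much more than this one header):

```python
def is_cacheable(response_headers):
    """Conservative policy: refuse to cache any response that carries Vary."""
    for name in response_headers:
        # Header field names are case-insensitive.
        if name.lower() == "vary":
            return False  # treated the same as 'Vary: *'
    return True
```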

Kevin Kiley

