httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Darroch <>
Subject thoughts on ETags and mod_dav
Date Thu, 11 Oct 2007 21:50:23 GMT
Hi --

   A couple of months ago a short thread started in relation to the
PRs #16593 and #38034 (which also references #42987) on the various
problems related to ETags:

   I'm not about to tackle these all immediately, but I wanted to
send out a couple of proposals and find out what problems folks
might see with them.

1) Per #38034, it appears that ap_meets_conditions() treats "*" incorrectly.
The patch attached to the PR duplicates ap_meets_conditions() for
exclusive use by mod_dav.  The only changes are to add checks on
resource->exists in two places and to use resource->hooks->getetag instead
of looking for an ETag header in r->headers_out.

   In the interest of avoiding code duplication, I wonder if it would
be possible to continue to use ap_meets_conditions() by using
r->no_local_copy in place of resource->exists, and by setting/unsetting
that and an ETag header in r->headers_out in mod_dav as necessary;
e.g., prior to calls to ap_meets_conditions().

   However, it looks to me as though r->no_local_copy is used in a few
places to bypass the ap_meets_conditions() checks entirely (i.e., to
never send a 304 response in certain cases).

   Does anyone have comments on what the intended purpose
r->no_local_copy is or was, especially in relation to the language
of RFC 2616 sections 14.24 and 14.26 ("if '*' is given and any current
entity exists for that resource ...")?

   Would it be more appropriate and/or required by our backporting
rules to add another field to request_rec, like r->exists, to indicate
to ap_meets_conditions() that a resource does not exist?  Or to use
r->notes or conceivably r->finfo for the same purpose?

   Is it even possible to backport a change to the action of
ap_meets_conditions() such that it would start depending on data
which third-party callers won't be supplying?  Advice is welcome, please!

2) I haven't looked at #16593 in any detail, but I wonder if the
change proposed in the patch attached to #38034 where
resource->hooks->getetag is used instead of looking for an ETag header
in r->headers_out might not be an attempt to deal with this issue.
If so, then whatever the eventual solution for #38034 turns out to
be will likely resolve this PR as well.

3) Per #42987, it seems as if there's a subtle difference between
the requirements for ETags as described in RFC 2616 section 13.3.3
and the way we currently generate them in ap_make_etag().

   When a file's mtime is in the past, ap_make_etag() uses the file's
inode, length, and/or mtime to create a strong ETag.  However, when the
mtime is the same as r->request_time, a weak ETag is generated.  As the
attachment to the PR reveals, though, this isn't really sufficient when
certain sequences of requests occur within a single second.

   The RFC says that a file's mtime "could" be a weak validator, because
the file might change more than once in a second.  But I think the
subtly important issue here is that word "could".  If I understand it
correctly, the implication is that one has resource for which one wants
to generate weak ETags only, because one knows the semantic meaning isn't
changing although the content may be.  In this particular case, the
modification time *might* be appropriate to use as a weak validator
(although that seems somewhat unlikely, because non-semantic changes
that occurred seconds apart would still cause different weak ETags to be
generated, which probably isn't what you'd want in this circumstance).

   But, in the cases handled by ap_make_etag(), the content and/or the
semantic meaning of a resource could always be changing, so weak ETags
would seem to be, as the PR says, simply never appropriate.

   Alas, though, there is probably no way to quickly generate a strong
validators like a hash -- which is why we're using inodes, lengths, and
mtimes in the first place, of course.

   So, a modest proposal, which aims to punt the problem to the
administrator, and also to resolve the fact that FileETag is provided
to configure the actions of ap_make_etag(), while mod_dav_fs's
dav_fs_getetag() actions are fixed.  The highlights would be:

- refactor ap_make_etag() and dav_fs_getetag() to use a common
  code base as far as possible

- make dav_fs_getetag() follow the options set by FileETag (if this
  is a problem, then introduce a DavFileETag directive, but I'm hoping
  that could be avoided)

- add MD5 and SHA1 as options to FileETag (while obviously much
  slower, certain applications may require "really strong" validators);
  for legacy reasons, these would have to be specifically configured
  and would not be implied by the "All" option

- add a nomtime=none|weak|ignore option to FileETag (in the manner
  of ProxyPass) which specifies what to do when the MTime option is
  configured but the file's mtime is the same as r->request_time;
  the default value is "weak" to retain backward compatibility

- allow mod_dav_fs to cache ETags in its state directory, either
  in its properties DBM files or elsewhere, at least when configured to
  generate "really strong" validators; e.g., lookup in cache whether
  file's inode, length, and mtime match cached values; if OK and mtime is
  not current time then return cached hash value; otherwise
  recalculate hash value and store in cache along with inode, length,
  and mtime for future comparisons

- change ETag generation logic to respect the new hash options (MD5, SHA1);
  possibly add support for caching, at least in the mod_dav case where
  the refactored function is called from dav_fs_getetag()

- change ETag generation logic (from current ap_make_etag() implementation)
  such that when the MTime option is not specified, no check against
  r->request_time is done; currently this is always done and the ETag may
  be made weak even if MTime is not configured

- change ETag generation logic to respect the new nomtime=none|weak|ignore
  option when MTime is configured; in this case, when the file's mtime
  equals r->request_time, either generate no ETag (none), generate a
  weak ETag in the current manner (weak), or generate a strong ETag (ignore)

   Whew.  Thoughts and criticisms?  Obviously none of this will get done
in the immediate future, but any discussion will, I hope, give me
and others some direction on these issues.  Thanks,


GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263  E4DE C8E3 FA36 366A 375B

View raw message