On Tue, Apr 13, 2010 at 1:20 PM, Brian Mearns <mearns.b@gmail.com> wrote:
On Tue, Apr 13, 2010 at 1:13 PM, Jonathan Zuckerman
<j.zuckerman@gmail.com> wrote:
>
>
> On Tue, Apr 13, 2010 at 12:13 PM, Brian Mearns <mearns.b@gmail.com> wrote:
>>
>> On Tue, Apr 13, 2010 at 10:49 AM, Jonathan Zuckerman
>> <j.zuckerman@gmail.com> wrote:
>> > On Tue, Apr 13, 2010 at 10:34 AM, Brian Mearns <bmearns@ieee.org> wrote:
>> >>
>> >> I'd like to use a stronger and correlated ETag, namely the hash of the
>> >> content being served. Obviously it's a drag to do this in-line, so I'm
>> >> planning an automated task to generate the ETag values and store them
>> >> on the server. Is there any way I can get httpd to grab these stored
>> >> values for use in the ETag header? I'm flexible on how I store them:
>> >> in a database, in one large file, each in its own file named according
>> >> to the resource, etc.
>> >>
>> >> Any ideas?
>> >>
>> >> Thanks,
>> >> -Brian
>> >>
>> >> --
>> >> Feel free to contact me using PGP Encryption:
>> >> Key Id: 0x3AA70848
>> >> Available from: http://keys.gnupg.net
>> >>
>> >> ---------------------------------------------------------------------
>> >> The official User-To-User support forum of the Apache HTTP Server
>> >> Project.
>> >> See <URL:http://httpd.apache.org/userslist.html> for more info.
>> >> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>> >>   "   from the digest: users-digest-unsubscribe@httpd.apache.org
>> >> For additional commands, e-mail: users-help@httpd.apache.org
>> >>
>> >
>> > I have some "static" content that's actually built dynamically on the
>> > server (it's just a concatenated, minified JS or CSS file), and
>> > therefore can't use Apache's default etags/expires headers which I
>> > believe only apply to real files, so I do the same thing you're
>> > suggesting, in php.
>> > I would much rather let Apache take care of this for me, but my
>> > obsessive and orderly mind demands that I keep the Javascript and CSS
>> > that applies to different parts of the site in different files, and my
>> > background in high-load high-availability web-serving makes me want to
>> > keep the number of http requests down.
>> > So my question to you is, what is your reason for wanting to do this,
>> > and how would you implement if it did exist?  It's pretty trivial to do
>> > it with a scripting language that can alter response headers, if in
>> > fact it's really necessary..
>>
>> The reason is just to optimize caching. I guess the ETag doesn't
>> really need to be any stronger than the built-in, but I would like it
>> to be correlated, meaning if the content hasn't actually changed, or
>> has changed and then changed back, it will have the same ETag even
>> though the last-mod time is different.
>>
>> I'm not sure exactly what you mean by how I would implement it. In
>> terms of generating the ETag values? For true static content, I would
>> just hash the file. For PHP, for instance, I would filter it through
>> `php -w` first, and hash the result. Like I said, I'm not sure exactly
>> how I will store the generated values, it depends on how I'm actually
>> getting the values in the headers. I would use either a cron job or a
>> publishing-script to update the stored ETags.
>>
>> I have done this before in PHP, but I'd hate to have to serve static
>> content through a wrapper PHP script just to put an ETag header in
>> there.
>>
>> Thanks,
>> -Brian
>
> check out http://httpd.apache.org/docs/2.0/mod/core.html#fileetag
> If you can't get what you want with that, my personal opinion is that the
> performance gained by your request would not justify the amount of time
> required to develop it.

Thanks for the reference. The FileETag directive is not as strong as
what I'm looking for. I understand your sentiment about it not being
worth the effort, but development effort is temporary; performance
improvements are forever =).

-Brian

Of course performance is everything.

As I understand your proposal, this is what would happen every time a user requests a resource:
compute the ETag from a hash of the _file contents_ -> compare it to the ETag the client sent -> send either a 200 or a 304 response

This is what currently happens:
build the ETag from the _file attributes_ (inode, size and modification time, depending on FileETag) -> compare it to the client's ETag/expires data -> send the response
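
The two flows above can be sketched roughly as follows. This is a minimal Python illustration, not Apache's actual code: the function names (content_etag, attribute_etag, status_for) are made up for the example, the attribute-based tag only approximates what FileETag builds from inode, size and mtime, and the If-None-Match handling is simplified (no weak validators).

```python
import hashlib
import os


def content_etag(path):
    """Proposed scheme: ETag derived from the bytes of the file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return '"' + digest.hexdigest() + '"'


def attribute_etag(path):
    """Current scheme (approximation): ETag built from file metadata only."""
    st = os.stat(path)
    # Assumption: inode, size and mtime joined as hex; Apache's exact
    # format is not reproduced here.
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, st.st_mtime_ns)


def status_for(current_etag, if_none_match):
    """Return 304 if the client's If-None-Match lists the current ETag, else 200."""
    if if_none_match is None:
        return 200
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    if "*" in candidates or current_etag in candidates:
        return 304
    return 200
```

Either function can feed status_for; the only difference is whether the server has to read the whole file or just stat it before deciding between 200 and 304.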

It seems to me that you lose performance right off the bat: reading the full contents of the file and hashing them will certainly take longer than computing the ETag from the file's attributes.

The only scenario in which your method performs better than the standard Apache ETag implementation is when a file is modified and later restored to its original state between a user's first and second requests for that resource.  If I were an engineer in your group, I'd ask to see some data showing that this happens often enough to make up for the initial performance loss.
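
The modify-then-restore case can be reproduced with a small experiment. The sketch below is only an illustration under stated assumptions: the helper names, the file name site.css and the timestamps are all hypothetical, and attribute_tag uses just size and mtime rather than Apache's real format.

```python
import hashlib
import os
import tempfile


def content_tag(path):
    """Tag derived from the file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def attribute_tag(path):
    """Tag derived from size and mtime only (assumed stand-in for FileETag)."""
    st = os.stat(path)
    return "%x-%x" % (st.st_size, st.st_mtime_ns)


def write_at(path, data, mtime_seconds):
    """Write data, then force the modification time to a known value."""
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (mtime_seconds, mtime_seconds))


def revert_scenario():
    """Publish a file, change it, restore it; return the tags before and after."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "site.css")
        write_at(path, b"body { color: black; }\n", 1000000000)
        first = (content_tag(path), attribute_tag(path))
        write_at(path, b"body { color: red; }\n", 1000000100)
        write_at(path, b"body { color: black; }\n", 1000000200)
        second = (content_tag(path), attribute_tag(path))
    return first, second
```

Calling revert_scenario() should return matching content tags and differing attribute tags, which is the one case where a client holding the first ETag would get a 304 under the content-hash scheme but a full 200 under the attribute-based one.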

Does that make sense, or am I missing any details?