httpd-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vacelet, Manuel" <manuel.vace...@enalean.com>
Subject Re: [users@httpd] Last-Modified header overridden
Date Wed, 29 Jun 2016 07:32:57 GMT
On Tue, Jun 28, 2016 at 6:32 PM, Luca Toscano <toscano.luca@gmail.com>
wrote:

>
>
> 2016-06-27 14:52 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>
>>
>>
>> On Mon, Jun 27, 2016 at 2:17 PM, Luca Toscano <toscano.luca@gmail.com>
>> wrote:
>>
>>>
>>>
>>> 2016-06-27 13:17 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>>>
>>>>
>>>>
>>>> On Sat, Jun 25, 2016 at 11:00 AM, Luca Toscano <toscano.luca@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2016-06-24 17:26 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com
>>>>> >:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Jun 19, 2016 at 3:17 PM, Luca Toscano <toscano.luca@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-06-08 16:14 GMT+02:00 Vacelet, Manuel <
>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>
>>>>>>>> On Tue, Jun 7, 2016 at 11:02 PM, Luca Toscano <
>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2016-06-07 10:55 GMT+02:00 Vacelet, Manuel <
>>>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 6, 2016 at 5:32 PM, Vacelet, Manuel <
>>>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> dOn Mon, Jun 6, 2016 at 5:00 PM, Vacelet, Manuel
<
>>>>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 6, 2016 at 4:09 PM, Luca Toscano
<
>>>>>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was able to repro building httpd from
2.4.x branch and
>>>>>>>>>>>>> following your configuration files on
github. I am almost sure that
>>>>>>>>>>>>> somewhere httpd sets the Last-Modified
header translating "foo" to the
>>>>>>>>>>>>> first Jan 1970 date. I realized though
that I didn't recall the real issue,
>>>>>>>>>>>>> since passing value not following the
RFC can lead to inconsistencies, so I
>>>>>>>>>>>>> went back and checked the correspondence.
Quoting:
>>>>>>>>>>>>>
>>>>>>>>>>>>> "Actually I wrote this snippet to highlight
the behaviour (the
>>>>>>>>>>>>> original code sent the date in iso8601
instead of rfc1123) because it was
>>>>>>>>>>>>> more obvious.
>>>>>>>>>>>>> During my tests (this is extracted from
an automated test
>>>>>>>>>>>>> suite), even after having converted dates
to rfc1123, I continued to get
>>>>>>>>>>>>> some sparse errors. What I got is that
the value I sent was sometimes
>>>>>>>>>>>>> slightly modified (a second or 2) depending
on the machine load."
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my understanding is that you would
like to know why a
>>>>>>>>>>>>> Last-Modified header with a legitimate
date/time set by a PHP app gets
>>>>>>>>>>>>> "delayed" by a couple of seconds from
httpd, right?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes for sure, this is the primary issue.
>>>>>>>>>>>> However, the (undocumented) difference of
behavior from one
>>>>>>>>>>>> version to another (2.2 -> 2.4 and more
surprisingly from between two 2.4
>>>>>>>>>>>> versions) is also in question here.
>>>>>>>>>>>> Even more strange, 2.4 built for other distrib
doesn't
>>>>>>>>>>>> highlight the behaviour !
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I made another series of test and it seems to
be linked to
>>>>>>>>>>> fastcgi.
>>>>>>>>>>>
>>>>>>>>>>> I took the stock apache (2.4.6 plus tons of patches)
 & php-fpm
>>>>>>>>>>> (5.4.16 + tons of patches) from RHEL7 and I get
the exact same behaviour
>>>>>>>>>>> (headers rewritten to EPOCH)
>>>>>>>>>>> However, if I server the very same php script
from mod_php
>>>>>>>>>>> (instead of fcgi) it "works" (the headers are
not modified).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> For the record, I also have the same behaviour (headers
rewritten
>>>>>>>>>> when using php-fpm + fastcgi) on alpine linux 3.4
that ships apache2-2.4.20.
>>>>>>>>>> So AFAICT, it doesn't seem distro specific.
>>>>>>>>>>
>>>>>>>>>> On the root of the problem, from my point of view:
>>>>>>>>>> - the difference between mod_php vs. php-fpm + fcgi
is
>>>>>>>>>> understandable (even if not desired and not documented).
>>>>>>>>>> - the fact that fcgi handler parse & rewrite
headers seems to
>>>>>>>>>> lead to inconsistencies (I'll try to build a test
case for that).
>>>>>>>>>> - however, even if the headers are wrong, I think
apache default
>>>>>>>>>> (use EPOCH) is wrong as it leads to very inconsistent
behaviour (the
>>>>>>>>>> resource will never expire). I would prefer either:
>>>>>>>>>> -- do not touch the header
>>>>>>>>>> -- raise a warning and discard the header
>>>>>>>>>>
>>>>>>>>>> What do you think ?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From my tests the following snippet of code should be
responsible
>>>>>>>>> for the switch from 'foo' to unix epoch:
>>>>>>>>>
>>>>>>>>> *https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663
>>>>>>>>> <https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663>*
>>>>>>>>>
>>>>>>>>> The function that contains the code,
>>>>>>>>> ap_scan_script_header_err_core_ex, is wrapped by a lot
of other functions
>>>>>>>>> eventually called by modules like mod-proxy-fcgi. A more
verbose
>>>>>>>>> description of the function in:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://github.com/apache/httpd/blob/2.4.x/include/util_script.h#L200
>>>>>>>>>
>>>>>>>>> Not sure what would be the best thing to do, but probably
we could
>>>>>>>>> follow up in a official apache bugzilla task?
>>>>>>>>> https://bz.apache.org/bugzilla/enter_bug.cgi?product=Apache%20httpd-2
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Wow, thanks for the investigation !
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for the delay! I submitted a patch for trunk with a possible
>>>>>>> fix, namely dropping (and logging at trace1 level) any non compliant
>>>>>>> date/time set in a Last-Modified header returned by a FCGI/CGI
script:
>>>>>>> http://svn.apache.org/r1748379
>>>>>>>
>>>>>>>
>>>>>> Cool :)
>>>>>>
>>>>>>
>>>>>>> The fix is also in the list of proposal for backport to the 2.4.x
>>>>>>> branch, we'll see what other people think about this solution.
>>>>>>>
>>>>>>> We should also do a follow up for the other main issue, namely
the
>>>>>>> fact that you see a different/delayed Last-Modified header sometimes
among
>>>>>>> your FCGI/httpd responses. Can you give me an example of Last-Modified
>>>>>>> header value before/after the "delay" and a way to repro it?
>>>>>>>
>>>>>>
>>>>>> I wrote a test case in the "time" branch:
>>>>>> https://github.com/vaceletm/bug-httpd24/tree/time
>>>>>>
>>>>>> In my own tests, I get:
>>>>>> --------------------->8---------------------
>>>>>> < Date: Fri, 24 Jun 2016 15:21:46 GMT
>>>>>> < Server: Apache/2.4.18 (Red Hat)
>>>>>> < X-Powered-By: PHP/5.6.5
>>>>>> < Last-Modified: Fri, 24 Jun 2016 15:21:48 GMT
>>>>>> < Transfer-Encoding: chunked
>>>>>> < Content-Type: text/html; charset=UTF-8
>>>>>> <
>>>>>> { [data not shown]
>>>>>>   0    44    0    44    0     0     21      0 --:--:--  0:00:02
>>>>>> --:--:--    21* Connection #0 to host localhost left intact
>>>>>>
>>>>>> * Closing connection #0
>>>>>> sent value: Fri, 24 Jun 2016 17:21:46 +0200
>>>>>> --------------------->8---------------------
>>>>>>
>>>>>> The value sent doesn't respect RFC1123 (+0200 instead of GMT as time
>>>>>> zone) but the result is weird as you can see:
>>>>>> - I sent "Fri, 24 Jun 2016 17:21:46 +0200"
>>>>>> - but apache decided to send "Fri, 24 Jun 2016 15:21:48 GMT"
>>>>>>
>>>>>> Notice the 2 seconds ?
>>>>>> I put a "sleep(2)" in my php script...
>>>>>>
>>>>>> I don't know if your fix also take this into account
>>>>>>
>>>>>
>>>>> Thanks a lot for the precise test! The same code snippet that I
>>>>> modified is responsible for the behavior that you mentioned. Httpd modifies
>>>>> the Last-Modified header with the request's modification time if the
value
>>>>> sent from FCGI appears to be in the future (since the HTTP RFC states
"An
>>>>> origin server with a clock MUST NOT send a Last-Modified date that is
later
>>>>> than the server's time of message origination (Date).").
>>>>>
>>>>> I modified your PHP code snippet (http://apaste.info/EEz) trying to
>>>>> compare a GMT date vs a "Europe/Paris" one, already formatted for RFC1123,
>>>>> and PHP seems to agree with httpd in recognizing the "Europe/Paris" date
as
>>>>> more recent. Moreover, if you generate a GMT date and format it for RFC1123
>>>>> the header is not modified with the extra two seconds.
>>>>>
>>>>> So from what I can see httpd does the correct thing, I don't see a bug
>>>>> like in the previous case. What do you think? I am far from a PHP expert
so
>>>>> I might have missed something important :)
>>>>>
>>>>>
>>>> Mmm I don't think it' the right way to compare the dates here as you
>>>> are really comparing the format strings here.
>>>> I propose a new version of the snippet: http://apaste.info/Aox
>>>>
>>>
>>>> Clearly, just changing the timezone doesn't impact the time comparison
>>>> (and it's the expected behaviour).
>>>>
>>>
>>> Correct, in general the best way should be the one that you proposed,
>>> but in this case we are talking about RFC1123 specific date/times, so the
>>> format string comparison should be relevant imho. A efficient RFC 822/1123
>>> parser would probably assume the GMT timezone and care only about what
>>> comes before, this is why Europe/Paris is seen as more recent than GMT. A
>>> super strict and correct parse would also check the GMT bit and return
>>> error if missing, but it may be a bit overkill.
>>>
>>>
>>>> To me there is a wrong attempt to comply with RFC in apache here.
>>>> Either the parser is able to:
>>>> 1. correctly read the header input
>>>> 2. normalize to GMT
>>>> 3. ensure the resulting date is not > to server time (+ probably log
>>>> somthing to help developers to understand things)
>>>> or there should be a warning and the header is dropped (like if it's
>>>> not a date).
>>>>
>>>> Here I thing either step 1 ou 2 are no done properly in apache.
>>>>
>>>>
>>> I am seeing things in a different way, namely only point 3 should/could
>>> be implemented. AFAIU RFC1123 (and related) assume a GMT date/time and
>>> since the HTTP RFC requires this format for the Last-Modified header, I
>>> don't believe that httpd should be required to be able to convert multiple
>>> formats/timezones to RFC1123. This seems to be backed up by the comments of
>>> the function used by httpd to convert the Last-Modified header value:
>>>
>>
>> Ok but current behaviour is not correct either.
>>
>
> From https://tools.ietf.org/html/rfc2616#section-14.29
>
>    An origin server MUST NOT send a Last-Modified date which is later
>    than the server's time of message origination. In such cases, where
>    the resource's last modification would indicate some time in the
>    future, the server MUST replace that date with the message
>    origination date.
>
> It also states that Last-Modified needs to be compliant with RFC 882/1123.
>

Yes, indeed, so it should be the same behaviour if the backend sends
"Last-modified: foo" or "Last modified: Fri, 24 Jun 2016 17:21:46 +0200"

None of them are compliant with RFC 822 or 1123 hence I expect apr/apache
to behave the same way.


>
>> If I understood well assume that apache receives a RFC1123 value so it
>> compares with current server time (and eventually sends the later).
>>
>> In my example, the date is not a valid RFC1123 value (because it sends
>> +0200 or Europe/Paris). Here, the most sensible default would be to trash
>> with value.
>> It's as invalid as "foo" in my initial example so from my point of view
>> the behaviour of apache should be the same: discard the header (thanks to
>> your patch) and raise a warning.
>>
>
> We could patch httpd/apr to be super strict but I am not sure if it is
> worth it. In the meantime, I tried to improve logging, would you mind to
> tell me what you think about http://apaste.info/JlZ ?
>

The log is great (minor fix, I believe it's RFC822 and not RFC1123
according to https://tools.ietf.org/html/rfc2616#section-3.3.1

Manuel

Mime
View raw message