httpd-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vacelet, Manuel" <manuel.vace...@enalean.com>
Subject Re: [users@httpd] Last-Modified header overridden
Date Fri, 01 Jul 2016 14:53:04 GMT
On Fri, Jul 1, 2016 at 4:14 PM, Luca Toscano <toscano.luca@gmail.com> wrote:

>
>
> 2016-06-29 9:38 GMT+02:00 Luca Toscano <toscano.luca@gmail.com>:
>
>>
>>
>> 2016-06-28 18:32 GMT+02:00 Luca Toscano <toscano.luca@gmail.com>:
>>
>>>
>>>
>>> 2016-06-27 14:52 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>>>
>>>>
>>>>
>>>> On Mon, Jun 27, 2016 at 2:17 PM, Luca Toscano <toscano.luca@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2016-06-27 13:17 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com
>>>>> >:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jun 25, 2016 at 11:00 AM, Luca Toscano <
>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-06-24 17:26 GMT+02:00 Vacelet, Manuel <
>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Jun 19, 2016 at 3:17 PM, Luca Toscano <
>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2016-06-08 16:14 GMT+02:00 Vacelet, Manuel <
>>>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>>>
>>>>>>>>>> On Tue, Jun 7, 2016 at 11:02 PM, Luca Toscano <
>>>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2016-06-07 10:55 GMT+02:00 Vacelet, Manuel <
>>>>>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 6, 2016 at 5:32 PM, Vacelet,
Manuel <
>>>>>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> dOn Mon, Jun 6, 2016 at 5:00 PM, Vacelet,
Manuel <
>>>>>>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 6, 2016 at 4:09 PM, Luca
Toscano <
>>>>>>>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I was able to repro building
httpd from 2.4.x branch and
>>>>>>>>>>>>>>> following your configuration
files on github. I am almost sure that
>>>>>>>>>>>>>>> somewhere httpd sets the Last-Modified
header translating "foo" to the
>>>>>>>>>>>>>>> first Jan 1970 date. I realized
though that I didn't recall the real issue,
>>>>>>>>>>>>>>> since passing value not following
the RFC can lead to inconsistencies, so I
>>>>>>>>>>>>>>> went back and checked the correspondence.
Quoting:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Actually I wrote this snippet
to highlight the behaviour
>>>>>>>>>>>>>>> (the original code sent the date
in iso8601 instead of rfc1123) because it
>>>>>>>>>>>>>>> was more obvious.
>>>>>>>>>>>>>>> During my tests (this is extracted
from an automated test
>>>>>>>>>>>>>>> suite), even after having converted
dates to rfc1123, I continued to get
>>>>>>>>>>>>>>> some sparse errors. What I got
is that the value I sent was sometimes
>>>>>>>>>>>>>>> slightly modified (a second or
2) depending on the machine load."
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So my understanding is that you
would like to know why a
>>>>>>>>>>>>>>> Last-Modified header with a legitimate
date/time set by a PHP app gets
>>>>>>>>>>>>>>> "delayed" by a couple of seconds
from httpd, right?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes for sure, this is the primary
issue.
>>>>>>>>>>>>>> However, the (undocumented) difference
of behavior from one
>>>>>>>>>>>>>> version to another (2.2 -> 2.4
and more surprisingly from between two 2.4
>>>>>>>>>>>>>> versions) is also in question here.
>>>>>>>>>>>>>> Even more strange, 2.4 built for
other distrib doesn't
>>>>>>>>>>>>>> highlight the behaviour !
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I made another series of test and it
seems to be linked to
>>>>>>>>>>>>> fastcgi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I took the stock apache (2.4.6 plus tons
of patches)  &
>>>>>>>>>>>>> php-fpm (5.4.16 + tons of patches) from
RHEL7 and I get the exact same
>>>>>>>>>>>>> behaviour (headers rewritten to EPOCH)
>>>>>>>>>>>>> However, if I server the very same php
script from mod_php
>>>>>>>>>>>>> (instead of fcgi) it "works" (the headers
are not modified).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> For the record, I also have the same behaviour
(headers
>>>>>>>>>>>> rewritten when using php-fpm + fastcgi) on
alpine linux 3.4 that
>>>>>>>>>>>> ships apache2-2.4.20.
>>>>>>>>>>>> So AFAICT, it doesn't seem distro specific.
>>>>>>>>>>>>
>>>>>>>>>>>> On the root of the problem, from my point
of view:
>>>>>>>>>>>> - the difference between mod_php vs. php-fpm
+ fcgi is
>>>>>>>>>>>> understandable (even if not desired and not
documented).
>>>>>>>>>>>> - the fact that fcgi handler parse &
rewrite headers seems to
>>>>>>>>>>>> lead to inconsistencies (I'll try to build
a test case for that).
>>>>>>>>>>>> - however, even if the headers are wrong,
I think apache
>>>>>>>>>>>> default (use EPOCH) is wrong as it leads
to very inconsistent behaviour
>>>>>>>>>>>> (the resource will never expire). I would
prefer either:
>>>>>>>>>>>> -- do not touch the header
>>>>>>>>>>>> -- raise a warning and discard the header
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think ?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> From my tests the following snippet of code should
be
>>>>>>>>>>> responsible for the switch from 'foo' to unix
epoch:
>>>>>>>>>>>
>>>>>>>>>>> *https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663
>>>>>>>>>>> <https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663>*
>>>>>>>>>>>
>>>>>>>>>>> The function that contains the code,
>>>>>>>>>>> ap_scan_script_header_err_core_ex, is wrapped
by a lot of other functions
>>>>>>>>>>> eventually called by modules like mod-proxy-fcgi.
A more verbose
>>>>>>>>>>> description of the function in:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/apache/httpd/blob/2.4.x/include/util_script.h#L200
>>>>>>>>>>>
>>>>>>>>>>> Not sure what would be the best thing to do,
but probably we
>>>>>>>>>>> could follow up in a official apache bugzilla
task?
>>>>>>>>>>> https://bz.apache.org/bugzilla/enter_bug.cgi?product=Apache%20httpd-2
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Wow, thanks for the investigation !
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sorry for the delay! I submitted a patch for trunk with
a possible
>>>>>>>>> fix, namely dropping (and logging at trace1 level) any
non compliant
>>>>>>>>> date/time set in a Last-Modified header returned by a
FCGI/CGI script:
>>>>>>>>> http://svn.apache.org/r1748379
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Cool :)
>>>>>>>>
>>>>>>>>
>>>>>>>>> The fix is also in the list of proposal for backport
to the 2.4.x
>>>>>>>>> branch, we'll see what other people think about this
solution.
>>>>>>>>>
>>>>>>>>> We should also do a follow up for the other main issue,
namely the
>>>>>>>>> fact that you see a different/delayed Last-Modified header
sometimes among
>>>>>>>>> your FCGI/httpd responses. Can you give me an example
of Last-Modified
>>>>>>>>> header value before/after the "delay" and a way to repro
it?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I wrote a test case in the "time" branch:
>>>>>>>> https://github.com/vaceletm/bug-httpd24/tree/time
>>>>>>>>
>>>>>>>> In my own tests, I get:
>>>>>>>> --------------------->8---------------------
>>>>>>>> < Date: Fri, 24 Jun 2016 15:21:46 GMT
>>>>>>>> < Server: Apache/2.4.18 (Red Hat)
>>>>>>>> < X-Powered-By: PHP/5.6.5
>>>>>>>> < Last-Modified: Fri, 24 Jun 2016 15:21:48 GMT
>>>>>>>> < Transfer-Encoding: chunked
>>>>>>>> < Content-Type: text/html; charset=UTF-8
>>>>>>>> <
>>>>>>>> { [data not shown]
>>>>>>>>   0    44    0    44    0     0     21      0 --:--:--  0:00:02
>>>>>>>> --:--:--    21* Connection #0 to host localhost left intact
>>>>>>>>
>>>>>>>> * Closing connection #0
>>>>>>>> sent value: Fri, 24 Jun 2016 17:21:46 +0200
>>>>>>>> --------------------->8---------------------
>>>>>>>>
>>>>>>>> The value sent doesn't respect RFC1123 (+0200 instead of
GMT as
>>>>>>>> time zone) but the result is weird as you can see:
>>>>>>>> - I sent "Fri, 24 Jun 2016 17:21:46 +0200"
>>>>>>>> - but apache decided to send "Fri, 24 Jun 2016 15:21:48 GMT"
>>>>>>>>
>>>>>>>> Notice the 2 seconds ?
>>>>>>>> I put a "sleep(2)" in my php script...
>>>>>>>>
>>>>>>>> I don't know if your fix also take this into account
>>>>>>>>
>>>>>>>
>>>>>>> Thanks a lot for the precise test! The same code snippet that
I
>>>>>>> modified is responsible for the behavior that you mentioned.
Httpd modifies
>>>>>>> the Last-Modified header with the request's modification time
if the value
>>>>>>> sent from FCGI appears to be in the future (since the HTTP RFC
states "An
>>>>>>> origin server with a clock MUST NOT send a Last-Modified date
that is later
>>>>>>> than the server's time of message origination (Date).").
>>>>>>>
>>>>>>> I modified your PHP code snippet (http://apaste.info/EEz) trying
to
>>>>>>> compare a GMT date vs a "Europe/Paris" one, already formatted
for RFC1123,
>>>>>>> and PHP seems to agree with httpd in recognizing the "Europe/Paris"
date as
>>>>>>> more recent. Moreover, if you generate a GMT date and format
it for RFC1123
>>>>>>> the header is not modified with the extra two seconds.
>>>>>>>
>>>>>>> So from what I can see httpd does the correct thing, I don't
see a
>>>>>>> bug like in the previous case. What do you think? I am far from
a PHP
>>>>>>> expert so I might have missed something important :)
>>>>>>>
>>>>>>>
>>>>>> Mmm I don't think it' the right way to compare the dates here as
you
>>>>>> are really comparing the format strings here.
>>>>>> I propose a new version of the snippet: http://apaste.info/Aox
>>>>>>
>>>>>
>>>>>> Clearly, just changing the timezone doesn't impact the time
>>>>>> comparison (and it's the expected behaviour).
>>>>>>
>>>>>
>>>>> Correct, in general the best way should be the one that you proposed,
>>>>> but in this case we are talking about RFC1123 specific date/times, so
the
>>>>> format string comparison should be relevant imho. A efficient RFC 822/1123
>>>>> parser would probably assume the GMT timezone and care only about what
>>>>> comes before, this is why Europe/Paris is seen as more recent than GMT.
A
>>>>> super strict and correct parse would also check the GMT bit and return
>>>>> error if missing, but it may be a bit overkill.
>>>>>
>>>>>
>>>>>> To me there is a wrong attempt to comply with RFC in apache here.
>>>>>> Either the parser is able to:
>>>>>> 1. correctly read the header input
>>>>>> 2. normalize to GMT
>>>>>> 3. ensure the resulting date is not > to server time (+ probably
log
>>>>>> somthing to help developers to understand things)
>>>>>> or there should be a warning and the header is dropped (like if it's
>>>>>> not a date).
>>>>>>
>>>>>> Here I thing either step 1 ou 2 are no done properly in apache.
>>>>>>
>>>>>>
>>>>> I am seeing things in a different way, namely only point 3
>>>>> should/could be implemented. AFAIU RFC1123 (and related) assume a GMT
>>>>> date/time and since the HTTP RFC requires this format for the Last-Modified
>>>>> header, I don't believe that httpd should be required to be able to convert
>>>>> multiple formats/timezones to RFC1123. This seems to be backed up by
the
>>>>> comments of the function used by httpd to convert the Last-Modified header
>>>>> value:
>>>>>
>>>>
>>>> Ok but current behaviour is not correct either.
>>>>
>>>
>>> From https://tools.ietf.org/html/rfc2616#section-14.29
>>>
>>>    An origin server MUST NOT send a Last-Modified date which is later
>>>    than the server's time of message origination. In such cases, where
>>>    the resource's last modification would indicate some time in the
>>>    future, the server MUST replace that date with the message
>>>    origination date.
>>>
>>> It also states that Last-Modified needs to be compliant with RFC
>>> 882/1123.
>>>
>>>
>>>> If I understood well assume that apache receives a RFC1123 value so it
>>>> compares with current server time (and eventually sends the later).
>>>>
>>>> In my example, the date is not a valid RFC1123 value (because it sends
>>>> +0200 or Europe/Paris). Here, the most sensible default would be to trash
>>>> with value.
>>>> It's as invalid as "foo" in my initial example so from my point of view
>>>> the behaviour of apache should be the same: discard the header (thanks to
>>>> your patch) and raise a warning.
>>>>
>>>
>>> We could patch httpd/apr to be super strict but I am not sure if it is
>>> worth it. In the meantime, I tried to improve logging, would you mind to
>>> tell me what you think about http://apaste.info/JlZ ?
>>>
>>
>> This one should be clearer: http://apaste.info/8pa
>>
>> I will also follow up with the dev@ mailing list to get other opinions
>> about this bug report.
>>
>>
>>
> Committed logging in trunk  and updated 2.4.x backport proposal:
> http://svn.apache.org/r1750883
>
> The logging message should look like:
>
> [Fri Jul 01 06:12:35.639343 2016] [proxy_fcgi:trace1] [pid 3542:tid
> 140561097561856] util_script.c(688): [client ::1:52261] The Last-Modified
> header value 'Fri, 01 Jul 2016 08:12:33 GMT' (parsed as RFC882/RFC1123
> datetime, that assumes the GMT timezone) has been replaced with: 'Fri, 01
> Jul 2016 06:12:35 GMT'. An origin server with a clock must not send a
> Last-Modified date that is later than the server's time of message
> origination.
>
> Thanks a lot for the bug report!
>
>
Thanks for fixing it !

However it's RFC822 and not 882 (882 is for domain names:
https://tools.ietf.org/html/rfc882)

Mime
View raw message