httpd-users mailing list archives

From Luca Toscano <toscano.l...@gmail.com>
Subject Re: [users@httpd] Last-Modified header overridden
Date Mon, 27 Jun 2016 12:17:38 GMT
2016-06-27 13:17 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:

>
>
> On Sat, Jun 25, 2016 at 11:00 AM, Luca Toscano <toscano.luca@gmail.com>
> wrote:
>
>>
>>
>> 2016-06-24 17:26 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>>
>>>
>>>
>>> On Sun, Jun 19, 2016 at 3:17 PM, Luca Toscano <toscano.luca@gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> 2016-06-08 16:14 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>
>>>> :
>>>>
>>>>> On Tue, Jun 7, 2016 at 11:02 PM, Luca Toscano <toscano.luca@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2016-06-07 10:55 GMT+02:00 Vacelet, Manuel <
>>>>>> manuel.vacelet@enalean.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 6, 2016 at 5:32 PM, Vacelet, Manuel <
>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>
>>>>>>>> On Mon, Jun 6, 2016 at 5:00 PM, Vacelet, Manuel <
>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>
>>>>>>>>> On Mon, Jun 6, 2016 at 4:09 PM, Luca Toscano <
>>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I was able to repro building httpd from 2.4.x branch and
>>>>>>>>>> following your configuration files on github. I am almost sure that
>>>>>>>>>> somewhere httpd sets the Last-Modified header translating "foo" to the
>>>>>>>>>> first Jan 1970 date. I realized though that I didn't recall the real issue,
>>>>>>>>>> since passing value not following the RFC can lead to inconsistencies, so I
>>>>>>>>>> went back and checked the correspondence. Quoting:
>>>>>>>>>>
>>>>>>>>>> "Actually I wrote this snippet to highlight the behaviour (the
>>>>>>>>>> original code sent the date in iso8601 instead of rfc1123) because it was
>>>>>>>>>> more obvious.
>>>>>>>>>> During my tests (this is extracted from an automated test suite),
>>>>>>>>>> even after having converted dates to rfc1123, I continued to get some
>>>>>>>>>> sparse errors. What I got is that the value I sent was sometimes slightly
>>>>>>>>>> modified (a second or 2) depending on the machine load."
>>>>>>>>>>
>>>>>>>>>> So my understanding is that you would like to know why a
>>>>>>>>>> Last-Modified header with a legitimate date/time set by a PHP app gets
>>>>>>>>>> "delayed" by a couple of seconds by httpd, right?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes for sure, this is the primary issue.
>>>>>>>>> However, the (undocumented) difference of behavior from one
>>>>>>>>> version to another (2.2 -> 2.4 and, more surprisingly, between two 2.4
>>>>>>>>> versions) is also in question here.
>>>>>>>>> Even stranger, 2.4 built for another distro doesn't exhibit
>>>>>>>>> the behaviour!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> I made another series of tests and it seems to be linked to fastcgi.
>>>>>>>>
>>>>>>>> I took the stock apache (2.4.6 plus tons of patches) & php-fpm
>>>>>>>> (5.4.16 + tons of patches) from RHEL7 and I get the exact same behaviour
>>>>>>>> (headers rewritten to EPOCH).
>>>>>>>> However, if I serve the very same php script from mod_php (instead
>>>>>>>> of fcgi) it "works" (the headers are not modified).
>>>>>>>>
>>>>>>>>
>>>>>>> For the record, I also have the same behaviour (headers rewritten
>>>>>>> when using php-fpm + fastcgi) on alpine linux 3.4 that ships apache2-2.4.20.
>>>>>>> So AFAICT, it doesn't seem distro specific.
>>>>>>>
>>>>>>> On the root of the problem, from my point of view:
>>>>>>> - the difference between mod_php vs. php-fpm + fcgi is
>>>>>>> understandable (even if not desired and not documented).
>>>>>>> - the fact that the fcgi handler parses & rewrites headers seems to lead
>>>>>>> to inconsistencies (I'll try to build a test case for that).
>>>>>>> - however, even if the headers are wrong, I think the apache default
>>>>>>> (use EPOCH) is wrong as it leads to very inconsistent behaviour (the
>>>>>>> resource will never expire). I would prefer either:
>>>>>>> -- do not touch the header
>>>>>>> -- raise a warning and discard the header
>>>>>>>
>>>>>>> What do you think ?
>>>>>>>
>>>>>>
>>>>>>
>>>>>> From my tests the following snippet of code should be responsible for
>>>>>> the switch from 'foo' to unix epoch:
>>>>>>
>>>>>> https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663
>>>>>>
>>>>>> The function that contains the code,
>>>>>> ap_scan_script_header_err_core_ex, is wrapped by a lot of other functions
>>>>>> eventually called by modules like mod_proxy_fcgi. A more verbose
>>>>>> description of the function is in:
>>>>>>
>>>>>> https://github.com/apache/httpd/blob/2.4.x/include/util_script.h#L200
>>>>>>
>>>>>> Not sure what would be the best thing to do, but probably we could
>>>>>> follow up in an official apache bugzilla task?
>>>>>> https://bz.apache.org/bugzilla/enter_bug.cgi?product=Apache%20httpd-2
>>>>>>
>>>>>>
>>>>> Wow, thanks for the investigation !
>>>>>
>>>>
>>>> Sorry for the delay! I submitted a patch for trunk with a possible fix,
>>>> namely dropping (and logging at trace1 level) any non compliant date/time
>>>> set in a Last-Modified header returned by a FCGI/CGI script:
>>>> http://svn.apache.org/r1748379
>>>>
>>>>
>>> Cool :)
>>>
>>>
>>>> The fix is also in the list of proposals for backport to the 2.4.x
>>>> branch, we'll see what other people think about this solution.
>>>>
>>>> We should also do a follow up for the other main issue, namely the fact
>>>> that you see a different/delayed Last-Modified header sometimes among your
>>>> FCGI/httpd responses. Can you give me an example of Last-Modified header
>>>> value before/after the "delay" and a way to repro it?
>>>>
>>>
>>> I wrote a test case in the "time" branch:
>>> https://github.com/vaceletm/bug-httpd24/tree/time
>>>
>>> In my own tests, I get:
>>> --------------------->8---------------------
>>> < Date: Fri, 24 Jun 2016 15:21:46 GMT
>>> < Server: Apache/2.4.18 (Red Hat)
>>> < X-Powered-By: PHP/5.6.5
>>> < Last-Modified: Fri, 24 Jun 2016 15:21:48 GMT
>>> < Transfer-Encoding: chunked
>>> < Content-Type: text/html; charset=UTF-8
>>> <
>>> { [data not shown]
>>>   0    44    0    44    0     0     21      0 --:--:--  0:00:02 --:--:--
>>>    21* Connection #0 to host localhost left intact
>>>
>>> * Closing connection #0
>>> sent value: Fri, 24 Jun 2016 17:21:46 +0200
>>> --------------------->8---------------------
>>>
>>> The value sent doesn't respect RFC1123 (+0200 instead of GMT as time
>>> zone) but the result is weird as you can see:
>>> - I sent "Fri, 24 Jun 2016 17:21:46 +0200"
>>> - but apache decided to send "Fri, 24 Jun 2016 15:21:48 GMT"
>>>
>>> Notice the 2 seconds ?
>>> I put a "sleep(2)" in my php script...
>>>
>>> I don't know if your fix also takes this into account
>>>
>>
>> Thanks a lot for the precise test! The same code snippet that I modified
>> is responsible for the behavior that you mentioned. Httpd modifies the
>> Last-Modified header with the request's modification time if the value sent
>> from FCGI appears to be in the future (since the HTTP RFC states "An origin
>> server with a clock MUST NOT send a Last-Modified date that is later than
>> the server's time of message origination (Date).").
>>
>> I modified your PHP code snippet (http://apaste.info/EEz) trying to
>> compare a GMT date vs a "Europe/Paris" one, already formatted for RFC1123,
>> and PHP seems to agree with httpd in recognizing the "Europe/Paris" date as
>> more recent. Moreover, if you generate a GMT date and format it for RFC1123
>> the header is not modified with the extra two seconds.
>>
>> So from what I can see httpd does the correct thing, I don't see a bug
>> like in the previous case. What do you think? I am far from a PHP expert so
>> I might have missed something important :)
>>
>>
> Mmm I don't think it's the right way to compare the dates here as you are
> really comparing the format strings here.
> I propose a new version of the snippet: http://apaste.info/Aox
>

> Clearly, just changing the timezone doesn't impact the time comparison
> (and it's the expected behaviour).
>

Correct, in general the best way should be the one that you proposed, but
in this case we are talking about RFC1123 specific date/times, so the
format string comparison should be relevant imho. An efficient RFC 822/1123
parser would probably assume the GMT timezone and care only about what
comes before it, which is why Europe/Paris is seen as more recent than GMT. A
super strict and correct parser would also check the GMT bit and return
an error if missing, but it may be a bit overkill.
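
To make the point concrete, here is a minimal Python sketch (not the httpd or
APR code) of a parser that ignores the zone token and treats the rest as GMT,
followed by the "not later than now" check. The names parse_assuming_gmt and
rationalize_last_modified are hypothetical, and the reference time is passed in
by the caller as an assumption about what the server compares against:

```python
from datetime import datetime, timezone

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_assuming_gmt(value: str) -> datetime:
    """Read 'Fri, 24 Jun 2016 17:21:46 <zone>' and ignore the <zone> token."""
    _day_name, day, month, year, clock, _zone = value.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    # The zone is deliberately discarded: the clock value is taken as GMT.
    return datetime(int(year), MONTHS[month], int(day),
                    hour, minute, second, tzinfo=timezone.utc)


def rationalize_last_modified(sent: str, now: datetime) -> datetime:
    """Hypothetical helper: never return a date later than `now`."""
    return min(parse_assuming_gmt(sent), now)


# Reference time taken from the curl output quoted earlier in the thread.
now = datetime(2016, 6, 24, 15, 21, 48, tzinfo=timezone.utc)
print(rationalize_last_modified("Fri, 24 Jun 2016 17:21:46 +0200", now))
print(rationalize_last_modified("Fri, 24 Jun 2016 15:21:46 GMT", now))
```

With the "+0200" value the clock part (17:21:46) is read as GMT, lands in the
future and is replaced by the reference time, which would explain the two
extra seconds after the sleep(2). The GMT value is left untouched.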


> To me there is a wrong attempt to comply with RFC in apache here.
> Either the parser is able to:
> 1. correctly read the header input
> 2. normalize to GMT
> 3. ensure the resulting date is not > to server time (+ probably log
> something to help developers to understand things)
> or there should be a warning and the header is dropped (like if it's not a
> date).
>
> Here I think either step 1 or 2 is not done properly in apache.
>
>
I am seeing things in a different way, namely only point 3 should/could be
implemented. AFAIU RFC1123 (and related) assume a GMT date/time and since
the HTTP RFC requires this format for the Last-Modified header, I don't
believe that httpd should be required to be able to convert multiple
formats/timezones to RFC1123. This seems to be backed up by the comments of
the function used by httpd to convert the Last-Modified header value:

https://github.com/apache/apr/blob/72d7d0922949f47d58574c3d638d9d8c30a08e3f/util-misc/apr_date.c#L98


I do agree with you that it would be awesome to have this kind of issue
sorted out directly by httpd, but it is also true that we shouldn't
consider it as a catch-all corrector for non RFC compliant HTTP responses
coming from upstream (even if in this case it corrects the wrong value with
a compliant one).
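
For completeness, the normalization can be done on the application side before
the header is sent. The sketch below is in Python rather than PHP, only to
illustrate the idea with standard library calls; to_http_date is a hypothetical
helper name:

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_date(value: str) -> str:
    """Convert an RFC 822-style date with a numeric offset to its GMT form."""
    parsed = parsedate_to_datetime(value)        # keeps the original offset
    in_gmt = parsed.astimezone(timezone.utc)     # same instant, expressed in UTC
    return format_datetime(in_gmt, usegmt=True)  # zone is written as "GMT"


print(to_http_date("Fri, 24 Jun 2016 17:21:46 +0200"))
```

Applied to the value from the test case, this yields the same instant written
with a GMT zone, which is the form the Last-Modified header is expected to use.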

This is my view of things, really curious to know other opinions from the
mailing list!

Thanks again!

Luca
