httpd-dev mailing list archives

From Jacob Champion <champio...@gmail.com>
Subject Re: svn commit: r1750953 - /httpd/httpd/trunk/server/util_script.c
Date Mon, 01 Aug 2016 19:13:57 GMT
All right, getting back to this after a week off. I've tried to combine 
feedback as best I can into one message.

Bill, you wrote:

> I'm perfectly happy to translate such values to GMT for non-HTTP
> inputs, per spec. If we are going to do so for HTTP inputs, loudly
> scolding the errant developer in the logs seems prudent, for their own
> longer-term benefit.

I suspect this is the core of the argument now. In my opinion, CGI is 
not an HTTP input -- either as far as the spec is concerned (for spec 
lawyering purposes) or in practice. It is a separate (HTTP-like) 
protocol, and an implementation detail of the server. In other words, we 
are not "proxying" to a CGI app; the CGI app is contained by the server 
and is providing inputs to the server response.

That is a separate argument from "is it wise to correct non-GMT timestamps", 
though.

> 1. Why do you/reporter want HTTP applications to persist in writing code
> which breaks between different transport providers/cgi hosting environments?
> The language has been crystal clear for 2 decades. We do a huge disservice
> to the PHP author community to let them be idiots. Alternately, the PHP
> SAPI itself could rectify this. (We aren't talking about non-HTTP sources.)

I'm not sure where PHP enters the conversation. They are only one (large 
and important!) CGI producer; we're talking about our behavior with 
*all* CGI applications here.

I do like your argument that we should do as little transformation as 
possible, in order to facilitate moving CGI apps between environments. 
Implementation differences are nasty for everyone. But I'm not convinced 
that ship hasn't sailed; currently, it looks like we modify outgoing CGI 
responses in order to merge headers, normalize Content-Type, and produce 
304 Not Modified and 412 Precondition Failed responses.

There may be others I have missed, but this doesn't look like the 
behavior of a server that considers itself a transparent "passthrough" 
to a CGI application. (Isn't that what CGI-NPH is for?) But! I could 
definitely be swayed otherwise, if that's what we'd like to do moving 
forwards. I think both sides have potential value, but we should choose one.

> If there is date input that we cannot handle, the
> spec strongly encourages us to interpret it as now(), provided we have a
> clock (which all of our architectures do.)

In the absence of a quote from the spec, I'm still in strong 
disagreement with this, based on the language I quoted last week.

Moving on to Stefan's comments:

> If we see CGI as a kind of input that is not strictly regulated by
> HTTP header formats (and that is an if), we should correct timezone
> offset to GMT, but otherwise leave the time unchanged. It might be our
> clock that has the issue. Meddling with it will not help anyone
> debugging problems.

+1 (and I am currently of the opinion that CGI is not a strict HTTP 
input, as stated above).

> If the value is unparseable, we should log it and suppress sending
> out a "Last-Modified" completely. Also any "If-*" checking should
> behave as if the header was not present.

+1.
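
To make that rule concrete, here is a minimal sketch in C. It is not httpd code: DATE_BAD, enum action, and decide() are hypothetical names invented for the illustration, and the dates are assumed to be already parsed into epoch seconds. It only shows that an unparseable Last-Modified is logged and dropped, and that the conditional check then behaves as if the header had never been sent.

```c
#include <stdio.h>

/* Hypothetical marker meaning "absent or could not be parsed". */
#define DATE_BAD (-1LL)

/* Hypothetical outcomes for a response that came from a script. */
enum action {
    SEND_FULL,        /* 200 with the script's Last-Modified kept   */
    SEND_FULL_NO_LM,  /* 200 with Last-Modified suppressed          */
    SEND_304          /* 304 Not Modified                           */
};

/*
 * last_modified:     the script's Last-Modified in epoch seconds, or DATE_BAD.
 * if_modified_since: the client's If-Modified-Since, or DATE_BAD if absent.
 */
static enum action decide(long long last_modified, long long if_modified_since)
{
    if (last_modified == DATE_BAD) {
        /* Log it, drop the header, and skip conditional handling. */
        fprintf(stderr, "unparseable Last-Modified from script; header dropped\n");
        return SEND_FULL_NO_LM;
    }
    if (if_modified_since != DATE_BAD && last_modified <= if_modified_since)
        return SEND_304;
    return SEND_FULL;
}
```

Note that the bad-date branch comes first, so a client's If-Modified-Since can never produce a 304 on the strength of a timestamp that could not be understood.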

> The alternative is to expect the CGI to honor HTTP/1.1 header
> semantics, pass values unchanged and let CGI and client run into
> misunderstandings immediately.

Practically, I'm not super opposed to this alternative (but if we choose 
it, we should apply it consistently). If I put on spec-lawyer hat, the 
CGI RFC has this to say:

[https://tools.ietf.org/html/rfc3875#section-6.2.1]

>    The server MUST make any appropriate modifications to the script's
>    output to ensure that the response to the client complies with the
>    response protocol version.

So this alternative is not my first choice. Invalid headers should 
really either be corrected (if the correction is obvious, safe, and 
helpful), or dropped entirely. Or the entire response should be 500'd, 
but we run into major compatibility breaks if we choose that route.

And finally, from the latest patch from Luca:

> 2) Some comments have been added in the code to state clearly that
> any non-compliant datetime strings will not be interpreted or re-formatted.

As stated above, this is not my first choice -- but I wouldn't oppose it 
if that's what the consensus comes to.

>          else if (!ap_cstr_casecmp(w, "Last-Modified")) {
> -            apr_time_t parsed_date = apr_date_parse_rfc(l);
> +            apr_time_t parsed_date = apr_date_parse_http(l);

apr_date_parse_http() is not good enough; IIUC, it completely ignores 
timezones, which further corrupts non-GMT Last-Modified stamps. We 
either want strict parsing or actual correction, not something in the 
middle.
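
As an illustration of why "something in the middle" is harmful, here is a self-contained C sketch. It does not call APR at all: parse_date(), format_gmt(), and days_from_civil() are hypothetical helpers written for this example, and the honor_zone = 0 mode only models the zone-ignoring behavior described above rather than reproducing the actual apr_date_parse_http() implementation. It accepts only the "Mon, 01 Aug 2016 21:13:57 +0200" shape (or a literal GMT zone).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Days since 1970-01-01 for a Gregorian calendar date. */
static long long days_from_civil(int y, int m, int d)
{
    long long era;
    int yoe, doy, doe;

    y -= (m <= 2);
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = (int)(y - era * 400);
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/*
 * Hypothetical helper, not an APR function.
 * honor_zone = 0: read the clock fields and ignore the zone entirely.
 * honor_zone = 1: apply the numeric offset so the result is true GMT.
 * Returns 0 on success, -1 if the string is unparseable.
 */
static int parse_date(const char *s, int honor_zone, long long *out)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    char wday[4], mon[4], zone[6];
    int day, year, hh, mm, ss, i, month = 0;
    long long offset = 0;

    if (sscanf(s, "%3s, %d %3s %d %d:%d:%d %5s",
               wday, &day, mon, &year, &hh, &mm, &ss, zone) != 8)
        return -1;
    for (i = 0; i < 12; i++)
        if (strcmp(mon, months[i]) == 0)
            month = i + 1;
    if (month == 0 || day < 1 || day > 31 || hh < 0 || hh > 23
        || mm < 0 || mm > 59 || ss < 0 || ss > 60)
        return -1;

    if (strcmp(zone, "GMT") != 0) {
        /* Expect a sign followed by exactly four digits: +hhmm or -hhmm. */
        if (strlen(zone) != 5 || (zone[0] != '+' && zone[0] != '-'))
            return -1;
        for (i = 1; i < 5; i++)
            if (!isdigit((unsigned char)zone[i]))
                return -1;
        offset = ((zone[1] - '0') * 10 + (zone[2] - '0')) * 3600LL
               + ((zone[3] - '0') * 10 + (zone[4] - '0')) * 60LL;
        if (zone[0] == '-')
            offset = -offset;
    }

    *out = days_from_civil(year, month, day) * 86400LL
         + hh * 3600LL + mm * 60LL + ss;
    if (honor_zone)
        *out -= offset;   /* local = GMT + offset, so GMT = local - offset */
    return 0;
}

/* Formats epoch seconds as an HTTP-date in GMT; buf must hold >= 30 bytes. */
static void format_gmt(long long epoch, char *buf, size_t len)
{
    time_t t = (time_t)epoch;
    struct tm *tm = gmtime(&t);

    if (tm == NULL || strftime(buf, len, "%a, %d %b %Y %H:%M:%S GMT", tm) == 0)
        buf[0] = '\0';
}
```

Feeding "Mon, 01 Aug 2016 21:13:57 +0200" through parse_date() with honor_zone = 1 and then format_gmt() gives "Mon, 01 Aug 2016 19:13:57 GMT". With honor_zone = 0 the same input comes out two hours later, labelled GMT, which is the corruption in question.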

--Jacob
