httpd-dev mailing list archives

From "Roy T. Fielding" <>
Subject Re: HTTP/1.1 strict ruleset
Date Thu, 04 Aug 2016 20:46:07 GMT
> On Aug 3, 2016, at 4:33 PM, William A Rowe Jr <> wrote:
> So it seems pretty absurd we are coming back to this over
> three years later, but is there any reason to preserve pre-RFC 2068
> behaviors? I appreciate that Stefan was trying to avoid harming
> existing deployment scenarios, but even as I'm about to propose
> that we backport all of this to 2.4 and 2.2, I have several questions;

In general, I don't see a need for any "strict" options. The only changes we
made to parsing in RFC7230 were for the sake of security or to fix known
interoperability failures. That is exactly what we are supposed to handle
automatically on behalf of our users: secure, correct, and interoperable
handling and generation of HTTP messaging.  We should not need to configure it.

Note that the MUST requirements in RFC7230 are not optional. We either implement
them as specified or we are not compliant with HTTP.  So, the specific issues of

   A sender MUST NOT send whitespace between the start-line and the
   first header field.  A recipient that receives whitespace between the
   start-line and the first header field MUST either reject the message
   as invalid or consume each whitespace-preceded line without further
   processing of it (i.e., ignore the entire line, along with any
   subsequent lines preceded by whitespace, until a properly formed
   header field is received or the header section is terminated).

   The presence of such whitespace in a request might be an attempt to
   trick a server into ignoring that field or processing the line after
   it as a new request, either of which might result in a security
   vulnerability if other implementations within the request chain
   interpret the same message differently.  Likewise, the presence of
   such whitespace in a response might be ignored by some clients or
   cause others to cease parsing.


   No whitespace is allowed between the header field-name and colon.  In
   the past, differences in the handling of such whitespace have led to
   security vulnerabilities in request routing and response handling.  A
   server MUST reject any received request message that contains
   whitespace between a header field-name and colon with a response code
   of 400 (Bad Request).  A proxy MUST remove any such whitespace from a
   response message before forwarding the message downstream.

must be complied with regardless of any "strict" config setting.
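
To make the two quoted requirements concrete, here is a minimal sketch in C of a per-line check. It is not httpd's actual parser: the `field_check` enum and `check_field_line()` are hypothetical names invented for this illustration, and the sketch covers only the two whitespace rules quoted above (it does not validate the remaining field-name characters or handle obs-fold).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; these are not httpd names. */
typedef enum {
    FIELD_OK,              /* "name: value" with no offending whitespace  */
    FIELD_LEADING_WS,      /* whitespace before the first header field    */
    FIELD_CONTINUATION,    /* later line starting with SP/HTAB (obs-fold), */
                           /* left to separate handling                   */
    FIELD_WS_BEFORE_COLON, /* whitespace between field-name and ':'       */
    FIELD_NO_COLON,        /* no ':' separator on the line                */
    FIELD_EMPTY_NAME       /* line begins with ':'                        */
} field_check;

/*
 * Classify one header line (without its CRLF).
 * is_first is nonzero when this is the first line after the start-line.
 */
static field_check check_field_line(const char *line, int is_first)
{
    const char *colon;

    if (line[0] == ' ' || line[0] == '\t')
        return is_first ? FIELD_LEADING_WS : FIELD_CONTINUATION;

    colon = strchr(line, ':');
    if (colon == NULL)
        return FIELD_NO_COLON;
    if (colon == line)
        return FIELD_EMPTY_NAME;

    /* Only the byte just before the colon is inspected here. */
    if (colon[-1] == ' ' || colon[-1] == '\t')
        return FIELD_WS_BEFORE_COLON;

    return FIELD_OK;
}
```

Under this sketch a server would answer 400 for `FIELD_WS_BEFORE_COLON`, and would either reject or skip the line for `FIELD_LEADING_WS`, as the quoted text allows.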

Some of those other things under "strict" seem a bit wonky. For example,
changing the Host header field when the incoming request URI is absolute
is fine by default but needs to be a configurable option for gateways.
Trying to validate IPv4/6 vs DNS doesn't work in intranet environments
that use local name servers.  The Location field-value is no longer required
to be absolute ("").

> 1. offer a logging-only option? Why? It seems like a simple
>    choice, follow the spec, or don't. If you want to see what's
>    going on, Wireshark, Fiddler and dozens of other tools let
>    you inspect the conversation.
> 2. leave the default as 'not-strict'? Seems we should most
>    strongly recommend that the server observe RFC's 2068,
>    2616 and 723x, and not tolerate ancient behavior by default
>    unless the admin insists on being foolish.

As far as the Internet is concerned, RFC723x is the new law of the land.
There is no reason to support obsolete RFCs.  No reason at all.  This has
nothing to do with semantic versioning or binary compatibility -- it is
simply doing what the product says it does: serve HTTP.

> 3. retain these legacy faulty behaviors in httpd
>    Seems that once we agree on a backport, the ancient
>    side of this logic should all just disappear from trunk.
> 4. detail the error to the error log? Again, there are inspection
>    tools, but more importantly, no visual user-agent is going
>    to send this garbage, and automated requests are going
>    to discard the 400 response. Seems we can save a lot of
>    code throwing away the details that just don't help, and
>    are generally the product of abusive traffic.
> Thoughts?

I think we just need to state in the log the reason for a 400 error. I don't like
sending invalid client-provided data back in a response, even when encoded.

Whitespace before the first header field can log a static message.
Whitespace after a field-name could log the field-name (no need to log the
field value). Invalid characters can be noted as "in a field-name" without
further data, or as "in a field-value" with only the field-name logged.
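
As a rough illustration of the logging policy above, the following C sketch builds the reason string for each case. The `reject_reason` enum, `format_reject_reason()`, and the 64-byte cap on the logged name are all assumptions made up for this example; it deliberately uses plain `snprintf` rather than any httpd logging call, and a real server would still want to escape the name before writing it to a log.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical rejection reasons; these are not httpd identifiers. */
typedef enum {
    REJECT_WS_BEFORE_FIRST_FIELD,
    REJECT_WS_AFTER_FIELD_NAME,
    REJECT_BAD_CHAR_IN_NAME,
    REJECT_BAD_CHAR_IN_VALUE
} reject_reason;

/*
 * Write a log reason for a rejected header line into buf.
 * Only the field-name is ever copied from the client data; the
 * field-value is never logged. Returns snprintf's result, or -1
 * for an unknown reason.
 */
static int format_reject_reason(char *buf, size_t len,
                                reject_reason why, const char *line)
{
    /* Field-name: everything before the first SP, HTAB, or ':'. */
    size_t n = strcspn(line, " \t:");
    int name_len = (int)(n > 64 ? 64 : n); /* arbitrary cap for the sketch */

    switch (why) {
    case REJECT_WS_BEFORE_FIRST_FIELD:
        return snprintf(buf, len, "whitespace before first header field");
    case REJECT_WS_AFTER_FIELD_NAME:
        return snprintf(buf, len, "whitespace after field-name '%.*s'",
                        name_len, line);
    case REJECT_BAD_CHAR_IN_NAME:
        /* The name itself is suspect, so log nothing from the client. */
        return snprintf(buf, len, "invalid character in a field-name");
    case REJECT_BAD_CHAR_IN_VALUE:
        return snprintf(buf, len,
                        "invalid character in a field-value of '%.*s'",
                        name_len, line);
    }
    return -1;
}
```

All of this runs only after the request has already been rejected, so it stays off the critical path.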

These are all post-error details off the critical path, so I don't buy the CPU
argument.  However, I do think our error handling in protocol.c has become
so verbose that it obscures the rest of the code.  Maybe it would be better if
we just stopped caring about 80-column viewing for calls to ap_log_*.
