httpd-dev mailing list archives

From: Justin Erenkrantz <jerenkra...@apache.org>
Subject: Re: PHP POST handling
Date: Wed, 02 Oct 2002 01:22:05 GMT
--On Tuesday, October 1, 2002 3:26 PM -0700 Greg Stein <gstein@lyra.org> 
wrote:

> So you're saying that if I do "file upload" to a PHP script, and upload a
> 10 megabyte file, then it is going to spool that whole mother into memory?

Yup.

> Oh oh... even better. Let's just say that the PHP script isn't even
> *thinking* about handling a request body. Maybe it is only set up for a
> GET request. Or maybe it *is* set up for POST, but only for FORM
> contents. But Mr Attacker comes along and throws 1 gigabyte into the
> request. What then? DoS? Swap hell on the server?

The PHP input filter will always read the body and allocate the space - 
irrespective of what the real script desires.

In fact, looking at the code, I believe PHP will only free the memory if 
the script reads the body (do all scripts read the entire body?).  So, a 
GET with a body (perfectly valid) may introduce memory leakage.  PHP uses 
malloc/realloc/free because it wants the body in one contiguous chunk - 
therefore, our pools don't help.
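
To make the allocation pattern concrete, here is a minimal sketch of a 
body accumulator that grows one contiguous block with realloc.  It is not 
PHP's actual code: body_buf, body_init, body_append, body_free and the 
limit field are hypothetical names, and the hard ceiling is an assumed 
policy that the filter described above does not apply.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator, not PHP's real structure. */
typedef struct {
    char  *data;   /* one contiguous heap block */
    size_t len;    /* bytes stored so far */
    size_t cap;    /* bytes currently allocated */
    size_t limit;  /* assumed ceiling on stored bytes */
} body_buf;

void body_init(body_buf *b, size_t limit)
{
    assert(b != NULL);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    b->limit = limit;
}

/* Append n bytes.  Returns 0 on success, -1 if the limit would be
 * exceeded or the allocation fails.  The buffer is left unchanged
 * on failure. */
int body_append(body_buf *b, const char *chunk, size_t n)
{
    assert(b != NULL);
    if (n == 0)
        return 0;
    if (n > b->limit - b->len)          /* len never exceeds limit */
        return -1;

    size_t need = b->len + n;
    if (need > b->cap) {
        size_t ncap = b->cap ? b->cap : 64;
        while (ncap < need) {
            if (ncap > b->limit / 2) {  /* avoid overflow when doubling */
                ncap = b->limit;
                break;
            }
            ncap *= 2;
        }
        char *p = realloc(b->data, ncap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = ncap;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return 0;
}

/* Must run at end of request whether or not the script read the body;
 * skipping it is the leak described above. */
void body_free(body_buf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

Because the block comes from malloc/realloc rather than a request pool, 
nothing releases it automatically; the owner has to call body_free on 
every path, including requests where the script ignores the body.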

> I think a filter *can* read the request body (i.e. the content generator
> loads a PHP script, PHP runs it (as the first filter), reads the body, and
> loads content from a database). But that implies that the request body
> should not have been thrown out in the default handler.

Correct.  At one point, I submitted a patch to the PHP lists to do exactly 
that, but once we rearranged how we discard bodies, this method couldn't 
work.

The problem we had was when to 'discard' the body - we originally had it 
discarding at the end, but in order to properly handle 413s, we have to 
discard the body before generating the response.  That's fairly new 
behavior on our part, but one that I think brings us in line with the 
intent of the RFC.

Otherwise, we could have a 200 and then find out that it really should have 
been a 413 (because the body is too large).  Therefore, we have to process 
the body before generating any content.  And, since we now allow chunked 
encoding almost everywhere, we do have to read the entire body to know if 
it exceeds our limit.  1.3 chickened out on this and forbade 
chunked-encoding on request bodies.
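
The ordering constraint can be sketched as follows.  This is an 
illustration only, not httpd's code: status_for_chunked_body is a 
hypothetical function, and it assumes the chunk sizes have already been 
parsed off the wire.  With chunked encoding there is no total length up 
front, so the sizes have to be summed before a status can be chosen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decide the status before any content is
 * generated.  Returns 413 as soon as the running total would exceed
 * limit, otherwise 200 once every chunk has been counted. */
int status_for_chunked_body(const size_t *chunk_sizes, size_t nchunks,
                            size_t limit)
{
    size_t total = 0;

    assert(chunk_sizes != NULL || nchunks == 0);
    for (size_t i = 0; i < nchunks; i++) {
        /* Compare against the remaining allowance so the sum
         * cannot overflow. */
        if (chunk_sizes[i] > limit - total)
            return 413;
        total += chunk_sizes[i];
    }
    return 200;
}
```

The 200 is only available after the loop finishes, which is why the body 
has to be consumed before the response starts.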

> But it almost seems cleaner to say there is a series of stages which
> perform the request handling: process the body and generate the (initial)
> content. These stages could load a script from somewhere, run it,
> (repeat) and generate the content into the filter stack.
>
> Right now, we are confusing a *script* with *content*.

I think the problem is that we aren't doing a good job of getting the 
script the content it (may) need.  While it could be interesting to try to 
separate reading and writing in Apache, certainly the PHP language doesn't 
support that (as I believe you can write and then read the body).  So, I'm 
not sure that we can split it out into multiple phases in an effective 
manner.  Reading and writing in PHP (or any CGI script) is just too 
intertwined to support this.

I think we're sorta stuck, but I might be wrong.  -- justin
