httpd-dev mailing list archives

From Greg Stein
Subject Re: [PATCH] Rework httpd buckets/filtering
Date Mon, 24 Sep 2001 04:39:39 GMT
On Sun, Sep 23, 2001 at 03:04:54PM -0700, Ryan Bloom wrote:
> On Sunday 23 September 2001 02:53 pm, Greg Stein wrote:
> > On Sun, Sep 23, 2001 at 08:33:07AM -0700, Ryan Bloom wrote:
> > > None of this can be committed until it has passed all of the tests in the
> > > perl test framework. The last time somebody (me) mucked with these
> > > filters it broke all of the CGI script handling. I would also like to
> > > hear from Jeff about this, because when we wrote the filters originally,
> > > we had input filtering use this design, and Jeff and Greg were adamant
> > > that filters needed to be able to return more than they were asked for.
> >
> > Euh... not sure if you typoed. Input filters should *NOT* return more than
> > they were asked for. Never.
> >
> > If a filter did that, it could end up passing data up to a filter that is
> > not responsible for that data. For example, it could read past the end of a
> > request, or read past the headers when that portion of the body was
> > supposed to go into a newly-inserted unzip or dechunk filter.
> >
> > I can think of more examples, if called upon :-)
> And I am saying, that was the original design, and Jeff Trawick and Greg
> Ames

Ah. An unqualified "Greg" in your post :-)  Shouldn't he be OtherGreg,
following in OtherBill's footsteps? hehe...

> posted a couple of examples that proved that design incorrect.

The typical example that I recall was related to decompression filters.

"If my caller asks for 100 bytes, then I read 100 bytes from the lower
level, and decompress to 150 bytes, then my caller is going to receive 150
bytes."

The correct answer is "return the 100 bytes they asked for, and set the
other 50 aside for the next invocation."

Consider the case where you have a multipart message coming in, and the
entire message is compressed (a Transfer-Encoding is used for the message;
Content-Encoding is used to compress the entities in the multipart). Next,
let us say that each multipart is handled by a different filter (say,
because they are handling different Content-Encodings, such as base64 or

To apply some symbology here: M is the message-body filter which is
decompressing the message according to the Transfer-Encoding header. E1 and
E2 are the entity-related filters, handling each entity in the multipart

Now, let's say that E1 asks for 100 bytes. M returns 150. But E1 *only*
wanted 100 bytes. Those other fifty were supposed to go to E2.

Oops :-)

Instead, M should decompress the incoming data, return 100 (to E1) and put
the other 50 aside for the next filter invocation. E2 comes along and asks
for 300 bytes. We return the 50 we had set aside.

(M might decide to read more data to fill the request, but it is also
allowed to return less than asked for; the caller will simply ask again
for more.)
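
To make the set-aside behavior concrete, here is a minimal sketch in Python
rather than httpd's C. It is not the httpd filter API: the class name
InflateFilter, the read(n) method, and the read_lower callable are all
hypothetical, and zlib stands in for whatever the Transfer-Encoding really
is. The only point it illustrates is that M never hands back more than n
bytes and keeps the surplus for the next call.

```python
import io
import zlib


class InflateFilter:
    """Hypothetical stand-in for M: decompresses data from a lower level.

    read(n) returns at most n bytes; any surplus is set aside.
    """

    def __init__(self, read_lower):
        self._read_lower = read_lower        # assumed: callable(n) -> bytes, b"" at EOF
        self._inflater = zlib.decompressobj()
        self._set_aside = b""                # decompressed bytes not yet handed out
        self._eof = False

    def read(self, n):
        # Only go to the lower level when nothing is set aside.
        while not self._set_aside and not self._eof:
            chunk = self._read_lower(n)
            if chunk:
                self._set_aside = self._inflater.decompress(chunk)
            else:
                self._set_aside = self._inflater.flush()
                self._eof = True
        # Hand back at most n bytes; the rest waits for the next caller.
        out, self._set_aside = self._set_aside[:n], self._set_aside[n:]
        return out


# 100 bytes meant for E1 followed by 50 bytes meant for E2.
payload = b"1" * 100 + b"2" * 50
m = InflateFilter(io.BytesIO(zlib.compress(payload)).read)

e1_data = m.read(100)   # E1 asks for 100 and gets exactly its 100 bytes
e2_data = m.read(300)   # E2 asks for 300 and gets the 50 that were set aside
```

Returning fewer bytes than requested on the second call is fine under this
design; a caller that wants more simply calls read() again.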

If Jeff/OtherGreg :-) have an example where returning more than asked is
*required*, then it should be brought up again. I can't imagine a case where
it would ever be safe to return more than is desired.

[ other examples may relate to character decoding, where you want to avoid
  returning a partial multibyte character, so you return N+3 to get the rest
  of the character into the response; I say that you return N-1 and put the
  rest of the character aside for the next invocation ]
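
The "return N-1" idea can be sketched the same way. This assumes UTF-8 and
assumes the buffer already holds whole characters; the helper name
split_at_char_boundary is made up for illustration. It returns no more than
n bytes, backing up to a character boundary, and gives the remainder back to
be set aside.

```python
def split_at_char_boundary(buf, n):
    """Split buf into (head, rest) with len(head) <= n.

    head always ends on a UTF-8 character boundary, so a multibyte
    character is never cut in half. Assumes buf is valid, complete UTF-8.
    """
    if len(buf) <= n:
        return buf, b""
    cut = n
    # UTF-8 continuation bytes look like 0b10xxxxxx. If the byte at the cut
    # point is one, the cut would land mid-character, so step back.
    while cut > 0 and (buf[cut] & 0xC0) == 0x80:
        cut -= 1
    # Note: head may be empty if n is smaller than the first character.
    return buf[:cut], buf[cut:]


# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes in UTF-8.
data = "aé€".encode("utf-8")
head, rest = split_at_char_boundary(data, 2)   # asking for 2 would split "é"
```

Here the caller asked for 2 bytes and receives 1 (just "a"); the bytes of
"é" and "€" stay in rest for the next invocation.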


Greg Stein,
