httpd-dev mailing list archives

Subject Re: input filter commentary
Date Tue, 10 Oct 2000 13:57:40 GMT

> I'm probably missing something big, but I'd like to think that where
> http_filter() sits now we'd have one of several pieces of code
> (filter) depending on the state and the method of sending the body:
> header state:
>      a piece of code that knows [CR]LF and where the header ends
>      http_filter() already knows how to do this.
> body state where content-length was provided:
>      a piece of code that knows when the body ends based on content
>      length and number of bytes delivered to the filter above
>      http_filter() already knows how to do this.
> body state where body is chunked and app doesn't want to see chunks:
>      a dechunk filter
> body state where body is chunked but app wants chunks passed through:
>      a read-chunk filter that reads chunk headers to know where the
>      body ends but doesn't remove the chunk headers/trailers
>      I don't know if we really need to support this, but 1.3 has this
>      functionality.  This breaks when any further filtering is
>      performed, with the exception of filtering which maintains the
>      original length.
> any other state: 
>      no body, stay ready to receive the next header
> http_filter() could always switch on the state and call the right
> piece of code, but I don't see any insurmountable problems associated
> with them being separate filters, only one of which is installed at a
> time.
> As long as http_filter() has all of these responsibilities, playing
> with different combinations of transport encodings is messy.

This isn't possible.  It was my first design, and it just doesn't
work.  First of all, figuring out which state you are in is hard at the
http_filter level, because you really don't know what the headers look
like.  In the future, I expect http_filter to be the thing that actually
generates headers_in, and then all the other filters will be installed on
top of that.

The problem with having four different filters is this: where does the
brigade go when we are switching filters?  Since the brigade has to sit in
somebody's ctx pointer, it needs to be associated with a filter.  Because
the core_filter doesn't know how much data to pass up at a time, the
brigade will almost always end up sitting in http_filter's ctx
pointer.  If you try to remove that filter, the brigade disappears, and you
can't continue the request.
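
A minimal sketch of that failure mode, assuming toy types.  Everything
named toy_* here is hypothetical and is not the real filter or brigade
API; a plain byte buffer stands in for the brigade.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; NOT the real Apache filter structures. */
typedef struct toy_filter {
    const char *name;
    char *ctx_buf;            /* unread input parked in this filter's ctx */
    size_t ctx_len;
    struct toy_filter *next;  /* next filter down the stack, toward the socket */
} toy_filter;

/* Park unread bytes in the filter's ctx, as happens when the filter below
 * hands up more data than the caller asked for.  Returns 0, or -1 on OOM. */
static int toy_park(toy_filter *f, const char *data, size_t len)
{
    char *p = realloc(f->ctx_buf, f->ctx_len + len);
    if (p == NULL)
        return -1;
    memcpy(p + f->ctx_len, data, len);
    f->ctx_buf = p;
    f->ctx_len += len;
    return 0;
}

/* Unlink a filter from the stack.  Returns the number of buffered bytes
 * that went away with its ctx, i.e. data the request can no longer read. */
static size_t toy_remove(toy_filter **head, toy_filter *victim)
{
    size_t lost = 0;
    for (toy_filter **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            lost = victim->ctx_len;
            free(victim->ctx_buf);
            victim->ctx_buf = NULL;
            victim->ctx_len = 0;
            break;
        }
    }
    return lost;
}
```

Any non-zero return from toy_remove() is request data that was sitting in
the removed filter's ctx and is now gone.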

As far as http_filter switching on the state, that requires http_filter
to really know the protocol, because otherwise it can't determine what
state it is in.  There is no communication between the request_rec and the
http_filter.  Since the request_rec is the thing that knows the state, the
http_filter doesn't.  The only solution is to add an http_body filter,
which is also a bit buggy, because that assumes that http_filter will
never be called during body processing, which is just incorrect.  As soon
as we need data from the socket, we will call back down the stack, and
http_filter will need to know if we are in body or header state.
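
To make the dependency concrete, here is a rough sketch of the state
decision.  The toy_request struct and the toy_* names are hypothetical
stand-ins, not fields of the real request_rec.  Every input to the
decision comes from the parsed headers, which is exactly the information
a filter with no link to the request does not have.

```c
#include <stdbool.h>

/* Hypothetical names; nothing here is taken from the server source. */
typedef enum {
    TOY_STATE_HEADER,        /* still reading [CR]LF-terminated header lines */
    TOY_STATE_BODY_LENGTH,   /* body delimited by Content-Length */
    TOY_STATE_BODY_CHUNKED,  /* body delimited by chunk headers */
    TOY_STATE_NO_BODY        /* nothing to read; wait for the next header */
} toy_state;

typedef struct {
    bool headers_done;       /* the blank line ending the header was seen */
    bool chunked;            /* Transfer-Encoding: chunked was seen */
    long content_length;     /* -1 when there is no Content-Length header */
} toy_request;

/* Pick the input state from what the request knows.  Chunked encoding is
 * checked before Content-Length, as HTTP/1.1 gives it precedence. */
static toy_state toy_pick_state(const toy_request *r)
{
    if (!r->headers_done)
        return TOY_STATE_HEADER;
    if (r->chunked)
        return TOY_STATE_BODY_CHUNKED;
    if (r->content_length > 0)
        return TOY_STATE_BODY_LENGTH;
    return TOY_STATE_NO_BODY;
}
```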

> > *) http_filter cannot guarantee that it is returning a full line for
> >    getline(). Thus, getline() must have a loop to fetch data. It scans
> >    through the data looking for the LF (and could then do the CR/LF munging
> >    for getline callers), so it is unclear why http_filter is doing the
> >    munging.
> Really, only one piece of code needs to know what a header looks
> like.  getline() can solve the whole problem as long as there is a way
> for it to put back any data that it has grabbed after the header.  As
> long as http_filter() and getline() both have responsibilities
> regarding the header then reading the header is going to be less than
> optimal.

This is just wrong.  I'm sorry.  The filters were always designed to be
one way and one way only.  It is not possible for our filters to push data
back down.  The thing is that, currently, we have a hack because I want to
make progress.  Getline is bogus.  http_filter should be the entity
creating the request_rec and filling it out.  That solves this whole
issue, because then http_filter could easily determine which state it is
in and act accordingly.  Plus, it makes adding different protocols as
easy as replacing http_filter.
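
As an illustration of what one-way means in practice, the sketch below
keeps any bytes read past the end of a line in the reader's own context
rather than handing them back down.  The toy_ctx type, the toy_*
functions, and the fixed buffer size are assumptions made for the
example, not the server's code.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUF_MAX 256

/* Hypothetical per-filter context; not an Apache type. */
typedef struct {
    char buf[TOY_BUF_MAX];
    size_t len;
} toy_ctx;

/* Append newly read bytes to the ctx.  Returns 0, or -1 if they don't fit. */
static int toy_feed(toy_ctx *c, const char *data, size_t n)
{
    if (n > TOY_BUF_MAX - c->len)
        return -1;
    memcpy(c->buf + c->len, data, n);
    c->len += n;
    return 0;
}

/* Copy one line, without its CRLF or LF, into out.  Returns the line
 * length, or -1 if no complete line is buffered yet or out is too small.
 * Bytes after the line stay in the ctx for the next call; nothing is
 * pushed back toward the socket. */
static int toy_getline(toy_ctx *c, char *out, size_t outsz)
{
    char *lf = memchr(c->buf, '\n', c->len);
    if (lf == NULL)
        return -1;

    size_t consumed = (size_t)(lf - c->buf) + 1;  /* line plus its LF */
    size_t linelen = consumed - 1;
    if (linelen > 0 && c->buf[linelen - 1] == '\r')
        linelen--;                                /* drop the CR of a CRLF */
    if (linelen + 1 > outsz)
        return -1;

    memcpy(out, c->buf, linelen);
    out[linelen] = '\0';
    memmove(c->buf, c->buf + consumed, c->len - consumed);
    c->len -= consumed;
    return (int)linelen;
}
```

A caller loops on toy_feed() until toy_getline() stops returning -1;
whatever follows the line simply waits in the ctx.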

> I'm certainly interested in the input xlate filter, but I guess I'm
> more interested in nice ways for any filter to get inserted based on
> the requested URI.  When that happens, then the input xlate filter
> doesn't have to have any special smarts.
> At the end of last week when this issue was touched on I was not able
> to understand what problems keep us from being able to add filters
> after we read the request header.

Those issues are gone now.  The http_filter solved them.  Please read what
I wrote above, and then let me know if it still doesn't make sense as to
what some of the issues are here.  The input filtering is non-intuitive,
because we are operating on data on the way back up the stack.  The
problem with that is that it makes it much harder to add and remove
filters while filtering data.

> If we can't add filters particular to the requested URI then we
> cripple input filtering.  It isn't cool to have to build a giant
> filter list statically and have filter instances within the list
> enable/disable themselves on a per request basis.  Consider multiple
> arrangements for a pair of filters and what that means to the static
> filter list.

We can add/remove filters based on URI, but only at certain times.  The
problem with your patch from last week is that it makes input filtering
look like output filtering, even though they are dramatically different to
the end user.

Ryan Bloom               
406 29th St.
San Francisco, CA 94131
