httpd-dev mailing list archives

Subject Re: cvs commit: apache-2.0/src/main http_connection.c http_core.c http_protocol.c http_request.c util_filter.c
Date Thu, 05 Oct 2000 18:40:56 GMT

> > Does this make sense?  How do we add input filters for a request?  We
> > don't even have the request until after the input filters have been
> > run.  
> It should be o.k. to run through connection filters before we have set
> up request filters.  That is o.k. on output (for second response)...
> Why not on input?

All input filters are request filters.  That's what I'm trying to
explain.  It isn't possible to add an input content filter, at least not
in a way that I can see working every time.

> >       If you are thinking of the request body, that won't work.  Most
> > browsers send the request body in the same packet as the request headers,
> > so the body will actually be stored in the conn_rec's bucket_brigade, and
> > the body will never get to go through the input_filters that were added
> > after reading the request.
> What about the request body (and maybe even the header of the next
> request) in the same buffer read by CORE_IN (or SSL)?
>   It doesn't matter that the request body is in the same packet as the
>   request headers.  CORE_IN and getline() have to work together so that
>   any data after the header is made available to CORE_IN for the next
>   time it is called.
>   getline() is called repeatedly to grab the request line and header
>   fields.  It only cares about the conn_rec filters.  When it gets to
>   the end of the header any unparsed data must be chained back on the
>   conn_rec for CORE_IN (or SSL) to access again.

All of this assumes that CORE_IN has some understanding of HTTP.  It
doesn't, and it shouldn't.  The whole point of CORE_IN is that it only
understands reading from the network.

The problem is that when the CORE_IN filter reads a block, it does just
that: it reads a block of data, and it doesn't understand anything about
the format of the input.  It then passes this input back up to the
previous filter.  In this case, the previous filter is getline.  Getline
takes that data and parses it into one-line chunks.  If there is more than
one line, then getline stores the extra data in c->input_data until it is
called again.  That's it.  Those are all the filters there are.
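
To make the leftover handling concrete, here is a rough sketch in plain
C.  It is not the Apache code: pending_input is a hypothetical stand-in
for c->input_data, and pending_append and pending_getline are names
invented for this illustration.  It only shows how a block read from the
network gets cut into lines while the remainder waits for the next call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the leftover storage the message calls
 * c->input_data; this is not the real conn_rec layout. */
typedef struct {
    char   data[1024];
    size_t len;
} pending_input;

/* Append a freshly read network block to the pending data.
 * Returns 0 on success, -1 if the block does not fit. */
static int pending_append(pending_input *p, const char *block, size_t n)
{
    assert(p != NULL && block != NULL);
    if (n > sizeof(p->data) - p->len)
        return -1;
    memcpy(p->data + p->len, block, n);
    p->len += n;
    return 0;
}

/* Copy one line (without its LF or CRLF terminator) into out and keep
 * whatever follows it in p for the next call.
 * Returns the line length, or -1 if no complete line is buffered yet
 * or the line does not fit in out. */
static int pending_getline(pending_input *p, char *out, size_t outsz)
{
    assert(p != NULL && out != NULL);
    char *nl = memchr(p->data, '\n', p->len);
    if (nl == NULL)
        return -1;
    size_t consumed = (size_t)(nl - p->data) + 1;  /* includes the LF */
    size_t linelen = consumed - 1;
    if (linelen > 0 && p->data[linelen - 1] == '\r')
        linelen--;                                 /* drop the CR too */
    if (linelen + 1 > outsz)
        return -1;
    memcpy(out, p->data, linelen);
    out[linelen] = '\0';
    /* Shift the unparsed remainder to the front of the buffer. */
    memmove(p->data, p->data + consumed, p->len - consumed);
    p->len -= consumed;
    return (int)linelen;
}
```

If the block holds the request line, the headers, and the start of the
body, then once the blank line has been returned the body bytes are
still sitting in the pending buffer.  Nothing in this sketch knows they
are a body rather than more lines.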

When ap_get_client_block is called, the data has already been
filtered.  Filtering it again is not a good idea: we can't guarantee
that the filters will allow the same data to be filtered twice.

The only solution that would allow for request input filters is to have a
filter above the core filter that understands HTTP, and knows how to break
apart the headers from the body and from the next set of headers.  That
filter would have to do its own buffering of the input data, just passing
up a single block of data at a time.  That filter hasn't been written yet,
and I haven't really given it any thought at all.
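
Since that filter doesn't exist, the following is only a guess at its
smallest piece: finding where the headers stop in a buffered block.
header_end is a hypothetical helper, not an Apache function, and it
assumes the headers end with a strict CRLF CRLF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first byte after the blank line that ends
 * the header block ("\r\n\r\n"), or 0 if the terminator has not been
 * fully buffered yet.  Zero is safe as "not found" because a real
 * match always ends at offset 4 or later. */
static size_t header_end(const char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}
```

Everything before the returned offset would go to the header parser,
and everything after it is body (or the next request), which the filter
would have to hold and hand up one block at a time.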

Regardless, all filters must be on the filter stack before the first input
filter is called.  Adding an input filter while in the middle of filtering
the input will not cause the new filter to be called.
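
A toy model of that ordering problem, under heavy assumptions:
filter_stack, stack_add, stack_run and upcase are all made up for this
sketch and look nothing like the real filter API.  The point is only
that a block which has already gone through the stack is never shown to
a filter that is registered afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILTERS 8

/* Hypothetical filter type: transforms a buffer in place. */
typedef void (*filter_fn)(char *buf, size_t len);

typedef struct {
    filter_fn fns[MAX_FILTERS];
    size_t    count;
} filter_stack;

/* Register a filter.  Returns 0 on success, -1 if the stack is full. */
static int stack_add(filter_stack *s, filter_fn fn)
{
    assert(s != NULL && fn != NULL);
    if (s->count == MAX_FILTERS)
        return -1;
    s->fns[s->count++] = fn;
    return 0;
}

/* Run every filter that is registered at the moment of the call, once. */
static void stack_run(const filter_stack *s, char *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    for (size_t i = 0; i < s->count; i++)
        s->fns[i](buf, len);
}

/* Example filter: upper-case ASCII letters. */
static void upcase(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = (char)(buf[i] - 'a' + 'A');
}
```

Running the stack on a block, then adding upcase, leaves that block
untouched; only blocks read after the add are upper-cased.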

> If/when ap_get_client_block is called by a handler...
>   call ap_get_brigade(r->input_filters) instead of ap_bread()
>   If no request-specific filters were added, r->input_filters is
>   CORE_IN.
>   If request-specific filters were added, r->input_filters is not
>   CORE_IN, but eventually CORE_IN will be called to deliver data from
>   the client (possibly already read and stored by getline() or
>   CORE_IN in the conn_rec).

CORE_IN does not get data from the conn_rec.  It can't.  If it did, that
data would be filtered twice.

> What about the end of the request body?
>   The end of the request body is signaled by one of three things
>   (AFAIK :) ):
>   a) end of connection
>   b) client provided Content-Length field in header and we've seen
>      that many bytes
>   c) chunked encoding was used and we've hit the trailing chunk header
>   Somehow the request body filters need to get EOS (or equivalent)
>   when one of these happens.

That is not true.  BUFF didn't have an EOS concept.  The input filters
will end up working the same way.  The client will get a block of data,
and they will do with it as they please.

>   Case a) is pretty simple...
>   Case b)...  Theoretically, the CORE_IN filter could take care of
>   this (know the content-length and how many bytes had been delivered
>   so far), but CORE_IN doesn't know anything about the r and isn't
>   supposed to know such details.  If we have content-length in the
>   header, we can throw in a filter between the request body filters
>   and CORE_IN to insert EOS at the right point and make any extra data
>   returned by CORE_IN available to CORE_IN on the next pass.
>   Case c)...  The CHUNK_IN filter needs to return eos at the right
>   point (not hard), and make any extra data returned to it available
>   to CORE_IN on the next pass.
> > The more I look at our current filtering scheme, the more it looks like
> > input filters are only valid if they are added before the request is read
> > from the network.
> Why is that true of content filters?  

Because the data is passed back through all of the filters when it is
first read.  If we want to change that, fine, but currently what you have
added doesn't make sense, and it won't work.

I can fix it later today or tomorrow if you would like me to.  It requires
a new filter, which I was going to add anyway.  The new filter would
solve two problems.  1)  The input would be parsed into header and body
segments, which would be passed up to the higher filters when data was
requested.  2)  The \r\n issue would move down to that filter, because it
doesn't really belong in getline.


Ryan Bloom               
406 29th St.
San Francisco, CA 94131
