httpd-dev mailing list archives

From Jeff Trawick
Subject Re: [PATCH] dechunking, body filtering (fwd)
Date Mon, 16 Oct 2000 16:52:59 GMT

writes:

> On 14 Oct 2000, Jeff Trawick wrote:
> > writes:
> >
> > > You can't do this.  You can't even guarantee that http_filter will be in
> > > the server.  I have every intention of writing an input filter that takes
> > > ftp requests and converts them to http.  When I do that, there will be no
> > > http_filter in the request, and this will break.  No filter can EVER
> > > depend on another filter being in the stack.
> > 
> > You could put such a filter below HTTP_IN.
> > 
> > There are other cases where code outside of HTTP_IN will need to know
> > what it is up to.
> > 
> > Example:
> > 
> >   we're in some buffering logic on output...  we need to know whether
> >   there is more data already received on input (the next request)
> >   before we decide to send a small amount of data...  HTTP_IN would be
> >   holding onto such data if it has been received...  we need to take a
> >   look in its context
> This is a bogus example.  We don't do that.  

Go look at ap_bhalfduplex().

> > One part of Apache can certainly depend on another part of Apache
> > being there.  Filters don't change this.  We can require that *our*
> > code be at that level to properly enforce certain parts of HTTP.  We
> > don't allow modules to replace just anything.
> But http_filter doesn't need to be required, and making it required
> actually limits what other input filters we can write, or at least makes
> it a bit more annoying.  Imagine a filter that just creates its own
> request (I have already seen this filter in action BTW).  Now, we have to
> have the http_filter in the filter chain, even though my module
> understands exactly where it is in the protocol, and how to deal with the
> data.
> The problem is that http_filter has the wrong name now.  When it was
> started, it was going to really understand the http protocol.  Now, it is
> just a filter that understands how and when to unbuffer data.
> > > >   r
> > > >  
> > > >     Add "void *filtered_input" field; ap_get_client_block() holds onto
> > > >     leftover body in this field.  Ugly that we clutter the r...  Ugly
> > > >     I made it "void *" instead of "ap_bucket_brigade *" (avoids an additional
> > > >     #include in "httpd.h" for (probably) no good reason).
> > > 
> > > This was discussed yesterday and people really disliked it.  If filters
> > > are told the maximum amount of data they can return, this is
> > > unnecessary.
> > 
> > Some people really disliked it, sure...  I knew that and went ahead
> > anyway because I still think that the length parameter has some issues
> > with it.
> > 
> > . The length parameter means that filter B can ask filter C to return
> >   an amount of data which is convenient for filter B.  Any benefit to
> >   such a filter goes away because filter B then has to split at the
> >   point convenient to filter A.  It is much simpler when filters can
> >   return the amount of data they find appropriate. 
> Um, that is the same way that the read function works, the caller tells
> the operating system how much data it can comfortably handle, and the
> operating system respects that.

So what?

> > . ap_get_client_block() telling the next filter that it can only
> >   accept 8192 bytes breaks certain cases like a pipe bucket.  Do we
> >   need to implement some sort of filter to sit below
> >   ap_get_client_block() to get everything in memory and return only
> >   the desired amount of data?  All this to avoid a field in the r?
> Ummm, since there is no filter that returns a pipe bucket on input this
> too is a bogus example.  We aren't talking about adding a new filter.  If
> your filter creates a pipe bucket, then yes it is your filter's
> responsibility to make sure we don't return too much data.

Not nearly as bogus as your ftp->http conversion filter :)  A real
life example: mod_ext_filter in output-filter mode needs to be changed
to construct a pipe filter for output once we know we have delivered
the child process all the data to filter.  The same should happen with
the input side. 

> > . The fact that ap_get_client_block() deals with filtered byte counts
> >   but HTTP_IN deals with raw byte counts is also messy.  How does this
> >   work, anyway? 
> ap_get_client_block doesn't deal with byte counts at all.  Yes, it still
> keeps track of them, but it doesn't actually use that
> information.  ap_setup_client_block tells http_filter how much data needs
> to be read from the network, this is done through the conn_rec, which is
> messy, and I tried to avoid it for 8 hours.  

Hey, I'm happy to set a field in c instead of HTTP_IN's context data.
The only reason I put it in the ctx is that I thought you were gung ho
on leaving the c alone.  Sometimes we aren't so religious when it
comes time to arrange the bits to actually accomplish something.

Suppose I change the way we tell HTTP_IN (or whichever hypothetical
filter provides the same service) how many body bytes are left so that
it matches your change from this weekend...

The remaining issue is that I have ap_get_client_block() save state
between calls.  In exchange, all input filters are simplified in that
they don't care how many bytes the caller wants; they do what is
reasonable (and the pipe bucket *is* a reasonable example; other folks
will need buckets of indeterminate length too).  
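
To make the trade-off concrete, here is a minimal sketch of the idea in
plain C.  It is not the patch and it does not use the real Apache types:
demo_filter, demo_request, demo_filter_read() and demo_get_client_block()
are all hypothetical names, and a plain pointer/length pair stands in for
the brigade that the "filtered_input" field would hold.  The filter
returns whatever amount it finds convenient; only the caller-facing
function remembers the surplus between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an input filter: it returns whatever chunk
 * size it prefers and never asks how much the caller wants. */
typedef struct {
    const char *data;   /* the whole body held by the "filter" */
    size_t len;         /* total body length */
    size_t pos;         /* how much has been handed out so far */
    size_t chunk;       /* amount the filter likes to return at once */
} demo_filter;

/* Return a pointer to the next chunk; *n is its length (0 at end of body). */
static const char *demo_filter_read(demo_filter *f, size_t *n)
{
    size_t left = f->len - f->pos;
    const char *p = f->data + f->pos;

    *n = left < f->chunk ? left : f->chunk;
    f->pos += *n;
    return p;
}

/* Hypothetical per-request state playing the role of "filtered_input":
 * body data already returned by the filter but not yet consumed. */
typedef struct {
    demo_filter *in;
    const char *leftover;
    size_t leftover_len;
} demo_request;

/* Copy at most bufsize bytes into buf, saving any surplus from the filter
 * for the next call.  Returns the bytes copied, or 0 at end of body. */
static size_t demo_get_client_block(demo_request *r, char *buf, size_t bufsize)
{
    size_t n;

    if (r->leftover_len == 0) {
        /* Nothing saved from last time: pull the next chunk, whatever size. */
        r->leftover = demo_filter_read(r->in, &r->leftover_len);
    }
    n = r->leftover_len < bufsize ? r->leftover_len : bufsize;
    assert(n <= bufsize);
    if (n > 0) {
        memcpy(buf, r->leftover, n);
    }
    r->leftover += n;       /* keep the remainder for the next call */
    r->leftover_len -= n;
    return n;
}
```

With a filter that prefers 6-byte chunks and a caller that reads 4 bytes
at a time, the caller sees 4, then 2, then the start of the next chunk.
The filter never had to split anything on the caller's behalf; the cost
is the saved state in the request.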

Jeff Trawick | | PGP public key at web site:
          Born in Roswell... married an alien...
