httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@lnd.com>
Subject RE: summary: issues for my filtering patch
Date Wed, 28 Jun 2000 19:09:50 GMT
> From: rbb@covalent.net [mailto:rbb@covalent.net]
> Sent: Wednesday, June 28, 2000 11:17 AM
> 
> > If a filter is going to need to possibly modify a header based on
> > content, I don't see any way that filter can avoid buffering the
> > data. Unless the filter simply passes a flag on to a lower level
> 
> Exactly!  The filter can't avoid it.  So, then the question is how does
> the filter buffer the data.  There are two answers to this:
> 
> 1)  The filter just comes up with its own method for filtering, and if
> that means keeping the whole request in memory, then so be it.

If this happens, and no filter -wants- to manipulate the output, the
CGI's behavior may be async, and we can start showing users the results
of their inquiry immediately (or at least the table headers for data
to come in 15 seconds or so.)
 
> 2)  The core imposes some sane restrictions on how much of the request is
> in memory by controlling the buffering.

Then it sounds like we can't get async behavior?  Unless some filter really
-must- hold the headers until the entire data stream is cached, this
sounds like very bad behavior.  Who -must- hold headers until the
entire body is constructed?  Perhaps my own mod_robots, and it only cares
about reparsing the <HTML><HEAD>[<META ...>...]</HEAD> stream, so even it
should release the request fairly quickly.
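
As a rough sketch of the mod_robots case (written in Python for brevity,
not against Apache's C API; the function name and the 8 KB cap are made up
for illustration), a filter can hold output only until it has seen the end
of the HEAD section and then stream everything after it untouched:

```python
from typing import Iterable, Iterator


def hold_until_head_closes(chunks: Iterable[bytes],
                           limit: int = 8192) -> Iterator[bytes]:
    """Buffer chunks until </head> is seen (or `limit` bytes are held),
    then pass every later chunk straight through.

    Hypothetical helper for illustration; `limit` is an assumed safety cap.
    """
    held = bytearray()
    releasing = False
    for chunk in chunks:
        if releasing:
            # Past the HEAD section: no more buffering, just stream.
            yield chunk
            continue
        held += chunk
        # Case-insensitive search over what has been held so far.
        if b"</head>" in held.lower() or len(held) >= limit:
            releasing = True
            yield bytes(held)
            held.clear()
    # Input ended before </head> or the cap: flush whatever is left.
    if not releasing and held:
        yield bytes(held)
```

The point is only that the amount held is bounded by the size of the HEAD
section (or the cap), not by the size of the whole response.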

Few filters should even need to do what this suggests.  I agree the server
ought to become responsible and prohibit really bad memory behavior by
laying out the rules and mechanisms, but I strongly believe either patch
can and should provide the API to address this issue.  But the abuser of
this feature should be the one to jump through hoops, not the rest of the
module writing world :)
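
To make "rules and mechanisms" a little more concrete, here is one possible
shape for a core-owned buffer, again as a Python sketch rather than anything
from either patch.  Every name here (CoreBuffer, hold, request_limit,
release) and both size limits are assumptions for illustration.  A filter
that wants to hold data goes through the core, which refuses once the cap
is reached; a filter that needs more has to ask for it explicitly:

```python
from typing import List


class BufferLimitExceeded(Exception):
    """Raised when a filter tries to hold more than the core allows."""


class CoreBuffer:
    """Hypothetical core-owned buffer with a per-request cap."""

    def __init__(self, max_bytes: int = 65536, hard_max: int = 1048576):
        self.max_bytes = max_bytes   # default cap every filter gets
        self.hard_max = hard_max     # ceiling no filter may exceed
        self.held_bytes = 0
        self._chunks: List[bytes] = []

    def hold(self, chunk: bytes) -> None:
        # Refuse rather than grow without bound.
        if self.held_bytes + len(chunk) > self.max_bytes:
            raise BufferLimitExceeded(
                f"holding {len(chunk)} more bytes would exceed {self.max_bytes}")
        self._chunks.append(chunk)
        self.held_bytes += len(chunk)

    def request_limit(self, wanted: int) -> int:
        # The "hoop": a greedy filter must ask, and is still capped.
        self.max_bytes = min(wanted, self.hard_max)
        return self.max_bytes

    def release(self) -> List[bytes]:
        # Hand the held chunks on without copying them, and reset.
        chunks, self._chunks = self._chunks, []
        self.held_bytes = 0
        return chunks
```

Filters that never call hold() pay nothing, which is the property I care
about: only the module that insists on buffering has extra work to do.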

> Greg is proposing #1, I am proposing #2 (after much prompting and head
> beating from Roy).  My feeling, after having looked at this problem for a
> VERY long time, is that trying to just fit option 2 in after the current
> patch is applied is almost impossible.
> 
> Using option #1 is not a good idea IMNSHO, because it allows stupid
> filters to take down the server.
> 
> > that says "hey, here's the data to send, but don't send it until
> > I tell you it's okay, because I'm going to be adding a header". But
> > all you've gained in that case is the ability for the later layers
> > to start processing the data instead of having to wait for the
> > layer that thinks it needs to see all of its input before it can
> > send on its output. (Which it will presumably do by sending all
> > the accumulated data at once, so each later layer gets called
> > once and doesn't have to do any buffering.)
> 
> By having a top level filter do the buffering it requires, we can actually
> get single or zero copy in.
> 
> > But it also just seems like an optimization. Like I said, PHP
> > didn't have any sort of output buffering until the latest version,
> > and although there was some grousing from users, they generally
> > accepted the limitation of not being able to add headers after
> > non-header data had been sent.
> 
> Again, it is how the buffering is done that is important.  Asking each
> module to implement their own filtering is asking for trouble.
> 
> Ryan
> 
> _______________________________________________________________________________
> Ryan Bloom                        	rbb@apache.org
> 406 29th St.
> San Francisco, CA 94131
> -------------------------------------------------------------------------------
> 
