httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Stein <>
Subject the ap_r* thing (was: Re: Apache 2.0 beta STATUS)
Date Mon, 22 Jan 2001 11:36:45 GMT
On Sun, Jan 21, 2001 at 11:56:55PM -0600, William A. Rowe, Jr. wrote:
> > From: Greg Stein []
> > Sent: Sunday, January 21, 2001 11:26 PM
> > 
> > On Sun, Jan 21, 2001 at 09:58:20PM -0600, William A. Rowe, Jr. wrote:
> > > As I say, I'm for the module author controlling the unknowns with as little
> > > possible chance for interference from 'invisible forces'.
> > 
> > The invisible forces in this case is the requirements imposed by forcing a
> > module author and even the core Apache code to have to synchronize between
> > two totally different output mechanisms. Each coder must be aware of all the
> > other bits of code which could potentially produce output.  Even worse, they
> > need to know *how* that other code did it. Not just "did they?" but "how?".
> > That kind of invisible coupling across the code base is a recipe for bugs.
> > Insert a sync call here, insert one there. Oops, we forgot to sync the two
> > mechanism over there, too. Oy! :-)
> How is this possible unless we invoke a subrequest?

You've got a DAV plugin. You are implementing the MERGE method. Within its
response definition is a way to return property information. So your code
calls dav_send_propstat().

Now, without looking, how can you tell what output mechanism was used? Was
it ap_r* or was it ap_pass_brigade()? If it doesn't match what I'm using,
then I'm going to need to do some extra work.

[ the mixing problem only occurs with Ryan's patch, so I need to use it here
  as a demonstration of the mixing problem; not to pick on the patch ]

To clarify:

*) assume it used ap_r*. There will be content sitting in r->bb. I construct
   my brigade for other pieces of the MERGE response and call
   ap_pass_brigade(). I've now misordered the response because my brigade
   went before the stuff sitting in r->bb.

*) let's say that I send some data with ap_r* and then call the DAV utility
   function. assume it uses ap_pass_brigade(). It constructs the brigade and
   passes it. Again: a misorder occurs because the brigade from the utility
   went out before the r->bb brigade.

An answer is to tell everybody "you must use r->bb, and never your own
brigade." That is taking away choicse, rather than providing them. Also
recognize that anybody using r->bb must first create it, if it doesn't
exist. Just a little bit more pain.

Oh, and if people are supposed to use r->bb, then they better make sure to
call apr_brigade_flush() on it (watch out: only if it has been created!) to
make sure that if somebody *has* put data in there, that it gets
synchronized properly.

By using a filter in my patch, I've trapped all output to the network.
Nothing can go without passing through that filter. Therefore, I have a
perfect choke point to ensure that I can order all the output properly.

> I'll think on this another night.  And [sorry I already clipped the text],
> on the comment of 'fixing apache' v.s. 'fixing apr', I'm very strongly
> against leaving useful mechanisms in apache when an equivilant mechanism
> exists to migrate them cleanly into apr.  If it were a case of "here's the
> Apache patch, and we'll finish moving it to apr in three weeks", I'd say
> great!  Let's roll the beta and move it afterwards!  But we aren't - again,
> I'm seeing two methods that would both solve our problem, with different
> requirements on the authors.

My patch has *zero* requirements on module authors. Use whatever API you
have been using or want to use. It doesn't matter, and you don't ever have
to worry about what somebody else is using. There is no possibility for
synchronization problems.

I'm all for creating an APR solution, but the number one priority is Apache.
If our solution also happens to work for APR, then great. But I don't
believe that we can necessarily say "this works for APR" and then wedge it
into Apache and make module authors need to be aware of the various output
mechanisms and compensate for potential ordering problems.

> Please, Greg and Ryan, document -simply- where your patch stands up and
> the other falls down, so I can grok this quickly without wadeing chin
> deep in the filtering mechanics!

All right. Here are a few cases:

*) ordering skew. see above.

*) ap_rwrite(my_buffer, 100000, r)

   gstein: flush the existing buffer (in the old_write filter), wrap a
           transient bucket around my_buffer, and deliver the brigade

   rbb: crash due to a memory overwrite in apr_brigade_write() -- it would
        copy the 100k my_buffer into a 9k malloc buffer.
	[ assuming we fix that bug... ]

	apr_brigade_write() has nowhere to send the data, so it allocates
	and copies the 100k into a new HEAP bucket, appends it to the
	brigade (r->bb), and then returns.
	eek! 100k memory copy!
	[ assuming that we patch up ap_rwrite to deal with "big" strings ]
	okay, now ap_rwrite() appends a 100k transient bucket to r->bb and
        passes that along.
	[ okay, now we have to fix this for ap_rputs and ap_rvputs, too.
	  darn. those are Apache specific now, huh? ]

*) ap_rprintf(r, "<D:compare-report>%s</D:compare-report>", report)
   report == 200k string

   [ assume both patches use apr_vformatter to break their 4k limit on
     ap_[v]rprintf ]

   gstein: the filter's buffer is used as the formatting buffer, so the
	   formatting begins after any existing buffered output.
	   apr_vformatter calls the flush_func after it fills the formatting
	   buffer. this flush_func is simply flush_buffer() in my patch. if
	   apr_vformatter exits without calling flush_func, then we have new
	   content sitting in the filter's output buffer.

   rbb: the flush_func is called and a new bucket is created from the
	formatting buffer and inserted into the brigade. There is no way to
	send this data to the network from within apr_brigade_vprintf().

	[ okay... restructure the assumption: apr_brigade_vprintf is torched
	  because it can hose the working set. move the vformatter call up
	  to ap_vrprintf. ]

        ap_vrprintf now uses a flush_func that can go to the network. but
	wait, we have some code in there to say "oh, but this is too small,
	so let's put this part into the brigade." The patch now has a dual
	path to decide where to send the data: in a brigade and down the
	network, or a call to apr_brigade_write() (which makes another copy)
	[ note: the above is now Apache specific ]

What I'd really like to see is a variation of Ryan's patch that uses a
bucket at the tail which is over-allocated so that more data can be
appended. That would allow bucket/brigade users to "write a bunch of little
items", yet still work with buckets. This would also be helpful for other
APR users. However, it doesn't really solve the ap_r* problem like my patch
does (and vice versa: I don't solve the "little bitty bucket output"
problem, nor the other-APR-user problem).


Greg Stein,

View raw message