httpd-apreq-dev mailing list archives

From "Issac Goldstand" <mar...@beamartyr.net>
Subject Re: dev question: apreq 2 as a filter?
Date Sun, 25 Aug 2002 07:57:38 GMT

----- Original Message -----
From: "William A. Rowe, Jr." <wrowe@rowe-clan.net>
To: "Joe Schaefer" <joe@sunstarsys.com>
Cc: "Issac Goldstand" <margol@beamartyr.net>; "apreq list"
<apreq-dev@httpd.apache.org>
Sent: Sunday, August 25, 2002 9:02 AM
Subject: Re: dev question: apreq 2 as a filter?


> At 07:20 PM 8/24/2002, you wrote:
> >...
> >That is *exactly* what I'm saying; I think part of the confusion we're
> >having centers around *when* the apreq filter gets installed.  The
> >content-handler needs the ability to inject our apreq filter at runtime.
> >IMO, the injection should take place in the apreq_request_new
> >call, and if the content-handler wants to call ap_get_brigade, it should
> >do it between apreq_request_new() and apreq_request_parse().
>
> Hmmm.  I'm suggesting we constantly parse the client input body
> in the same manner (once we have injected the apreq filter) without
> pausing for more data.  Until the entire body has been ap_get_brigade()'d
> the results are somehow tagged 'incomplete'.
>
> In other words, fill up those variables that parse 'complete', set aside
> the incomplete chunks, and continue to parse on the next brigade read from
> where we left off.

I really like this method.  I'm going to make one huge reply to this, and
other points, below...

> >I think Stas is arguing that the apreq filter could be injected
> >later on, perhaps inside the apreq_request_parse call, but I think
> >that makes things too complicated.
>
> It cannot be injected after ap_get_brigade, and I see no case where
> it ever should be.  It must be injected beforehand, and we have any
> number of places that would be a good idea (such as the beginning
> of the handler hook, or the insert_filter hook, or any hook prior to
> that.)
>

I agree, but have a slightly different approach on how to accomplish this.
(See below :))

> To allow filters to 'peek' at the first 8+k of body before we hit the
> handler phase is also good, but I wouldn't get to the point that we
> slurp the entire POST body, only what might be necessary (the first
> few posted variables) before we hit the handler.
>
> In any case, we can never destroy the POSTed data in the input
> stack, so we need to set some arbitrary and sane limit on how much
> data can be pre-fetched before we hit the handlers, which ultimately
> are the final consumers.


Let me restate this whole process, as I see it happening, using the
warehouse lingo that we were using before.  I believe that we're on the same
wavelength here, but want to make sure...  I see three major components
here:  The filter, the parser, and the "warehouse manager"...

(1) First of all, the only real way that apreq should be installed is as an
input filter.  (2) The filter should be installed as early as possible and
(3) immediately create an empty data structure in memory - I'm not going to
say where (notes table should be fine if it's still there in Apache2),
because that's probably an entire conversation on its own.  In any case,
user intervention SHOULD take place as early as possible in Apache's
request-phase chain.  ((4) Frankly, we may find it useful to provide
httpd.conf directives that let users tweak the necessary configuration,
and to provide a handler that runs as early as possible to scan the
directives for each location before it starts.  This should include a
directive to *uninstall* (or disable, or whatever) the apreq filter, too.)
Moving along, (5) mechanisms to override the default input method, which is
to read the input and share it with other filters, should be provided, and can be invoked
anywhere up until the first call to ap_get_brigade.  Frankly, it ought to
work afterwards also, but then we run the risk of other filters choking on
mangled data.  We wouldn't want another filter to do that to us, so we
oughtn't do it to them.  In any case, (6) the configuration for the
request-specific parameters of the apreq call should be read during the
first callback of the actual filter (eg, first time ap_get_brigade is called
from anywhere).  (7)At that point, a flag is set in our little
request-specific apreq notepad to tell us that we've started munching data,
and that (7a) requests for behavior changes for the request-specific apreq
call should fail (not silently - it should return failure status to user -
possibly with a reason) and (7b) the warehouse doors are now open, but the
warehouse is flagged as being "stocking in progress" (the warehouse should
most likely NOT be in the same place as the configuration directives - the
former potentially needs lots of room, while the latter doesn't).  (8) If the
"exclusive mode" flag is set for this request (file upload? It doesn't
matter - what matters is that this is the Apache 1.x style apreq that
everyone's so keen on having in apreq2), then we simply don't pass the
brigade on to the next filter, unless, of course, it's EOS.  (9)Also at this
point (this is still *first* ap_get_brigade call only), we check to see if
the "populate-at-once" flag is set for this request.   We can have a
mechanism where we continuously call ap_get_brigade until we hit EOS to do
this.  Note that the "populate-at-once" and "exclusive" modes can thus run
independently of one another.  (10) Lastly, once EOS is received, we mark the
warehouse as "warehouse full" in the request-specific configuration notepad.

What remains is the warehouse manager.  (11) I think we need a 3-key system
to manage the warehouse entries: "Data/Name", "Value" and some flag (bit?)
"Status".  To do this, the parser would start populating entries in the
warehouse as data comes in (from the filter).  (12) As soon as each entry is
completed in the warehouse, the status flag should be set to indicate
"in-stock".  (13)An entry in the per-request configuration "notepad" can
contain the name of the current "item" being imported into the warehouse.
(14)Calls to get data from warehouse (this is the "warehouse manager" part)
should scan the warehouse entries.  (14a)If an item is "in-stock", no
additional data-collection is needed.  (14b) If an item is in, but not
flagged, we call ap_get_brigade until it's flagged "in-stock" by the parser
(ONLY the parser can import to the warehouse, whereas ONLY the warehouse
manager can actually read items from the warehouse).  (14c) If the data is
not found and the "warehouse full" flag is set, the call fails.  (14d)
Otherwise, we continue to call ap_get_brigade (either explicitly from the
parser, or implicitly by simply setting the "populate-at-once" flag and
calling ap_get_brigade once from the parser [I'd say explicit is better,
simply because it allows us to continually check the warehouse for the
addition of our data and stop calling ap_get_brigade once our data is
"in-stock"].)  Once we hit "warehouse full" (note that the warehouse manager
doesn't care about EOS - all it cares about is "warehouse full") and haven't
found our data, the call fails.
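
To make steps (11) through (14) a bit more concrete, here is a rough sketch in
plain C of the warehouse and the manager's lookup loop.  It is only an
illustration under assumptions: every name in it (wh_entry, warehouse,
wh_import, wh_get, wh_demo_pull) is made up and is not an apreq or httpd API,
the fixed-size array stands in for whatever storage we end up choosing, and
the "pull" callback merely simulates one ap_get_brigade() round trip that lets
the parser stock more entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none of them are apreq or httpd APIs. */
enum { WH_MAX_ENTRIES = 16 };

typedef struct {
    const char *name;     /* "Data/Name" key */
    const char *value;    /* "Value"; may be partial while stocking */
    int         in_stock; /* "Status": 1 once the parser completed the entry */
} wh_entry;

typedef struct warehouse warehouse;

/* Stand-in for one ap_get_brigade() round trip through the filter.
 * Returns 0 if more input was processed, non-zero otherwise. */
typedef int (*wh_pull_fn)(warehouse *w);

struct warehouse {
    wh_entry   entries[WH_MAX_ENTRIES];
    size_t     count;
    int        full;      /* "warehouse full": set once EOS has been seen */
    wh_pull_fn pull;
    void      *pull_ctx;
};

/* Parser side: the only code that imports entries into the warehouse. */
int wh_import(warehouse *w, const char *name, const char *value, int complete)
{
    assert(w != NULL);
    for (size_t i = 0; i < w->count; i++) {
        if (strcmp(w->entries[i].name, name) == 0) {
            w->entries[i].value = value;
            w->entries[i].in_stock = complete;
            return 0;
        }
    }
    if (w->count == WH_MAX_ENTRIES)
        return -1;
    w->entries[w->count++] = (wh_entry){ name, value, complete };
    return 0;
}

/* Manager side: the only code that reads entries (steps 14a to 14d). */
const char *wh_get(warehouse *w, const char *name)
{
    assert(w != NULL);
    for (;;) {
        for (size_t i = 0; i < w->count; i++) {
            wh_entry *e = &w->entries[i];
            if (e->in_stock && strcmp(e->name, name) == 0)
                return e->value;   /* 14a: in stock, nothing more to read */
        }
        if (w->full)
            return NULL;           /* 14c: everything parsed, item missing */
        /* 14b/14d: item absent or still being stocked, so pull more input */
        if (w->pull == NULL || w->pull(w) != 0)
            return NULL;
    }
}

/* Simulated input for the body "a=1&b=2" arriving in two reads;
 * pull_ctx points at an int counting the reads so far. */
int wh_demo_pull(warehouse *w)
{
    int *step = w->pull_ctx;
    if (*step == 0) {
        wh_import(w, "a", "1", 1);   /* complete */
        wh_import(w, "b", "", 0);    /* still being stocked */
    } else if (*step == 1) {
        wh_import(w, "b", "2", 1);
        w->full = 1;                 /* EOS reached */
    } else {
        return -1;
    }
    (*step)++;
    return 0;
}
```

With a zero-initialized warehouse whose pull is wh_demo_pull, asking for "a"
costs a single read, while asking for a name that was never posted drains the
input and then fails on "warehouse full" rather than on EOS, which is the
behavior described in (14c) and (14d).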

I think that about covers the lifespan of an apreq call.  What do you people
think?

  Issac

