httpd-apreq-dev mailing list archives

From Joe Schaefer <...@sunstarsys.com>
Subject Re: dev question: apreq 2 as a filter?
Date Sun, 25 Aug 2002 14:12:31 GMT
"William A. Rowe, Jr." <wrowe@rowe-clan.net> writes:

[...]

> Let me be clear, I would strongly oppose (veto? dunno) the suggestion of
> two truly distinct mechanisms for apreq to operate.

So would I.

> Why?  Because we earn two places for lingering bugs and security holes
> instead of one.

Exactly right.

> As a filter, any filter and the final handler can share the apreq results.
> Even if we provide a 1.3-style lookalike, it should still be implemented
> in terms of the same filter we are discussing.

That's what I'm after also.

> >Joe, you seem to be seeing some third possibility which I must be missing,
> >as your comment doesn't fit either of these scenarios...  (Please don't
> >flame me too bad if I'm making some stupid error - this is the first time
> >I'm being brave enough to comment on anything Apache2 API related :-))
> 
> No, I read his suggestion as two courses, and suggest we narrow it down
> to the one truly generic solution.

I don't see why you're reading me that way.  I think the apreq filter
can/should operate in a completely transparent way, since all it has to 
do is read a copy of the buckets into the apreq_list _as the upstream_
_filters dictate_.  Every time our filter is invoked, it can make a
stab at parsing the apreq_list data, so the list should never get
very big.

Let me try to explain how we handle a file upload right now, and 
how I think we can do it using a filter.  I'm just trying to flesh
out a bit of the details so we can focus on where our viewpoints
diverge.

NOW: (apreq_list as a sink)

  1) apreq_request_new() sets up the parser stack and returns 
     the address of our warehouse.
  2) the content-handler does some intermediary work unrelated to
     apreq ...
  3) it now wants access to the warehouse; winds up calling 
     apreq_request_parse()
     a) the apreq_parser_mfd parser is engaged
     b) the mfd parser initializes itself from the request headers
     c) the parser asks apreq_list_read to locate a header block
        * apreq_list_read calls ap_get_brigade, asking for ~ 8KB
        * apreq_list_read flattens the resultant brigade into its list
        * apreq_list_read clears the brigade
        * apreq_list_read scans the list for a CRLF CRLF marker.
        * if it hasn't found one yet, it repeats the cycle.
     d) the parser parses the header block and determines that
        we're about to read a file upload.
     e) the parser enters a for(;;) loop, calling apreq_list_read
        until it returns 0 bytes.
        * apreq_list_read calls ap_get_brigade, asking for ~8KB 
        * apreq_list_read flattens the brigade into the list
        * apreq_list_read destroys the buckets in the brigade
        * apreq_list_read scans the list for an "end of data" marker.
        * apreq_list_read returns whatever it's got so far
          (up to the "end of data" marker).
        * the parser writes that returned data to a tempfile.
     f) goto (c), which exits the parser and sets the req->status = OK
        since there are no headers left to parse.
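
To make step (c) concrete, here is a minimal sketch of the blocking
loop in plain C. It does not use the real apreq or APR APIs:
sketch_list, sketch_source, source_read, find_header_end and
read_header_block are all hypothetical names, and source_read merely
stands in for "call ap_get_brigade and flatten the result into the
list".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the apreq_list: a flat byte buffer. */
typedef struct {
    char   data[8192];
    size_t len;
} sketch_list;

/* Hypothetical input source standing in for ap_get_brigade plus
 * flattening: hands out `src` in pieces of at most `chunk` bytes. */
typedef struct {
    const char *src;
    size_t      len;
    size_t      pos;
    size_t      chunk;
} sketch_source;

/* Offset just past the first CRLF CRLF in buf, or 0 if there is none. */
size_t find_header_end(const char *buf, size_t len)
{
    size_t i;
    assert(buf != NULL);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Copy up to `room` bytes of the next chunk into dst; 0 means no more. */
size_t source_read(sketch_source *s, char *dst, size_t room)
{
    size_t n = s->len - s->pos;
    if (n > s->chunk)
        n = s->chunk;
    if (n > room)
        n = room;
    memcpy(dst, s->src + s->pos, n);
    s->pos += n;
    return n;
}

/* Blocking style: keep reading until a whole header block is in the
 * list.  Returns its length, or 0 if input ends (or the list fills)
 * before a CRLF CRLF marker shows up. */
size_t read_header_block(sketch_list *l, sketch_source *s)
{
    for (;;) {
        size_t end = find_header_end(l->data, l->len);
        size_t n;
        if (end != 0)
            return end;
        n = source_read(s, l->data + l->len, sizeof l->data - l->len);
        if (n == 0)
            return 0;
        l->len += n;
    }
}
```

The point of the sketch is only that the caller does not get control
back until the marker has been found or the input is exhausted.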

FILTER: (apreq_list as a ``non blocking'' pass-thru filter)

  1) apreq_request_new() injects the apreq filter, sets up the parser
     stack inside the filter, and returns the address of our warehouse.
  2) the content-handler does some intermediary work unrelated to
     apreq ...
  3) it makes a call to ap_get_brigade, which engages the apreq filter
     a) the filter engages the mfd parser
     b) the parser initializes itself from the request headers
    c1) the parser enters a MFD_HEADER state.
    c2) in MFD_HEADER state, the parser asks apreq_list_read to 
        locate a header block
        * apreq_list_read calls ap_get_brigade, asking for some 
          amount depending on the upstream filter
        * apreq_list_read flattens the resultant brigade into its list
        * apreq_list_read clears its brigade, but leaves the buckets alone
        * apreq_list_read scans the list for a CRLF CRLF marker.
        * if it hasn't found one yet, returns an "EWOULDBLOCK"
          condition to the parser.
    d1) on a successful return, the parser parses the headers
        from the list, and enters a MFD_DATA state.  Otherwise, 
        it returns control to the upstream parser here.
    d2) in MFD_DATA state, the parser determines we're about to
        read a file upload.
     e) the parser asks apreq_list_read to fetch a block of data:
        * apreq_list_read calls ap_get_brigade, asking for some 
          amount depending on the upstream filter
        * apreq_list_read flattens the resultant brigade into its list
        * apreq_list_read clears its brigade, but leaves the buckets alone
        * apreq_list_read scans the list for an "end of data" marker.
        * return whatever it's got so far (up to the "end of data" marker).
        * the parser writes that returned data to a temp file
    f1) if the list returned 0 bytes, goto (c1).  In this case,
        we'll need to reduce the amount we ask for in c2

    f2) Otherwise return control to the upstream parser here.
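
The filter steps above can be sketched as a small state machine in
plain C. This is only an illustration under stated assumptions, not
the apreq implementation: sketch_parser, sketch_feed, sketch_find,
sketch_consume, the SKETCH_* codes and the MFD_DONE state are all
hypothetical, the caller hands in a copy of each chunk instead of the
filter calling ap_get_brigade, and the "temp file" is just a byte
counter. It handles a single part and stops, where the real parser
would go back to (c1).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum mfd_state { MFD_HEADER, MFD_DATA, MFD_DONE };
enum { SKETCH_OK = 0, SKETCH_EAGAIN = 1, SKETCH_FULL = 2 };

typedef struct {
    enum mfd_state state;
    char   list[1024];   /* stand-in for the apreq_list */
    size_t len;
    size_t written;      /* bytes "written to the temp file" */
} sketch_parser;

/* Offset of pat within buf, or len if pat does not occur. */
size_t sketch_find(const char *buf, size_t len, const char *pat)
{
    size_t plen = strlen(pat), i;
    for (i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, pat, plen) == 0)
            return i;
    }
    return len;
}

/* Drop the first n bytes of the list. */
void sketch_consume(sketch_parser *p, size_t n)
{
    memmove(p->list, p->list + n, p->len - n);
    p->len -= n;
}

/* Called once per filter invocation with a copy of whatever the
 * upstream asked for.  Never loops waiting for more input. */
int sketch_feed(sketch_parser *p, const char *chunk, size_t n,
                const char *marker)
{
    size_t mlen = strlen(marker);
    assert(p != NULL && chunk != NULL && mlen > 0);
    if (n > sizeof p->list - p->len)
        return SKETCH_FULL;
    memcpy(p->list + p->len, chunk, n);
    p->len += n;

    if (p->state == MFD_HEADER) {
        size_t at = sketch_find(p->list, p->len, "\r\n\r\n");
        if (at == p->len)
            return SKETCH_EAGAIN;    /* no complete header block yet */
        sketch_consume(p, at + 4);   /* headers would be parsed here */
        p->state = MFD_DATA;
    }
    if (p->state == MFD_DATA) {
        size_t at = sketch_find(p->list, p->len, marker);
        if (at < p->len) {           /* marker found: flush the rest */
            p->written += at;
            sketch_consume(p, at + mlen);
            p->state = MFD_DONE;
            return SKETCH_OK;
        }
        /* Flush everything except a tail that could turn out to be
         * the start of a marker split across two chunks. */
        if (p->len >= mlen) {
            size_t safe = p->len - (mlen - 1);
            p->written += safe;
            sketch_consume(p, safe);
        }
        return SKETCH_EAGAIN;
    }
    return SKETCH_OK;
}
```

SKETCH_EAGAIN plays the role of the "EWOULDBLOCK" condition: the
parser keeps its state and simply resumes on the next invocation.
Because data is flushed on every call, the list in this sketch holds
at most one incoming chunk plus the held-back tail.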

In this scenario, the apreq filter never consumes more data than the
upstream filter requests.  Even if the file upload is huge, the
associated apreq_list will never get larger than ~ 32KB, and the
mfd parser will be writing the data blocks directly to disk.

I readily admit I still don't understand how ap_get_brigade
really works, and am still muddy about what the relationship is
between filters, buckets, and brigades, so some of my hopes for the
apreq filter may be somewhat naive.

How does your proposal differ?

-- 
Joe Schaefer
