httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
Subject Re: Filter I/O take 3.
Date Sun, 18 Jun 2000 14:25:30 GMT

> > A couple of things.  Don't do this!  By you making these changes now, you
> > break a patch that is being considered.
> 1) There are common elements between both approaches. I believe we should
>    get that stuff ironed out first. That will make the "meat" of the
>    filtering much smaller and reviewable.

But in committing this, you broke a patch that has been submitted for
review and you did it intentionally.  It doesn't matter if you veto'ed the
patch or not.  Right now, nobody else can even review the patch to form
their own opinions.  Which means, you and I are really the only two who
can easily debate this patch, because anybody who tries to apply it now
just gets errors.  That is a very bad thing.

>    It also means that the non-meat portions can be effectively reviewed. For
>    example, your change from ap_rvputs() to ap_bvputs() was incorrect. I saw
>    it because I'm reviewing your filtering patches very closely. Did anybody
>    else? I seriously doubt it. It seems most people have tuned out :-(

I did.  In fact, I specifically mentioned that the headers stuff had to be
reviewed closer.  (If I didn't, I meant to, and I'm sorry.  I have other
e-mail that definately mentions it that I didn't send).  Just because
there are minor issues, doesn't mean  something is wrong, it's just not

>    We are in no rush to get the "meat" checked in, so it seems eminently
>    better to work on getting the fundamentals completed first. Specifically,
>    that means changing ap_rwrite types of calls, which write non-content,
>    over to direct buffer calls.

The approach to the fundamentals can have an effect on the "meat".  Please
don't make any more changes like this until we have decided on an
approach.  For example, I wouldn't have made you change, because it
doesn't make any sense in the hook scheme.

> 2) "considered" isn't entirely true, as the patch has been vetoed. Until
>    some basic changes occur, I don't see that veto changing.

It is true.  Whether you have vetoed it or not, it is being
considered.  And, it will continue to be considered until you or somebody
else post a new patch.  This is because other people can review the patch
and help to convince either you or me that we are wrong.  One veto does
not mean the patch has been removed from consideration.  It means one
person is stopping it from being committed.  Those are VERY different

> > 1)  Modules must maintain their own stat for the data they haven't dealt
> > with yet.
> In the link scheme, each ap_layer_t structure has a context pointer. This
> allows a layer to store arbitrary context about where they are.

But it is up to the module to maintain this state.

> These context pointers are analogous to the r->unfiltered[] concept that you
> introduced in your latest patch. You alloc one per layer; I put them right
> in with the layer. The unfiltered[] thing is specific to "unprocessed input"
> while the context pointer is arbitrary (and may include unprocessed input
> along with other values, such as the parsing state of a finite automata that
> is operating within the filter).

The difference is who maintains the state.  You make the module do it.  I
make the core do it.  If the module does it, then the module has to
implement it's own iovec's.  If the core does it, then the iovec's have
one implementation, and the modules only have deal with iovec's, not
iovec's and regular strings.

> IMO, associating the state in the ap_layer_t is more direct than a secondary
> array. I'm also a bit uneasy about keeping the two in sync (although it is
> quite unlikely that a filter would be added/removed/re-ordered after the
> first content byte goes out), and that it relies on the .nelts field which
> seems rather private to the hook implementation.

The reliance on nelts will be fixed as soon as their is a
ap_get_nelts() function for tables.  If a request re-orders the layers, it
has to make sure that all data has been flushed from the layers that are
being re-ordered otherwise the results will be very strange.  If the data
is empty, re-ordering is simple in the hook scheme.

> I'm unaware of any particular issues with storing state in the ap_layer_t.
> It doesn't seem like the considerations mentioned just above apply; please
> correct me otherwise.

Take mod_foo, which has saved in it's layer context the string "waiting to
be dealt with".  Now, the next time the filter is called (passed the data 
"just arrived to the filter", with the link scheme, the module has to
parse "waiting to be dealt with" and "just arrived to the filter".  This
is an iovec.  Why should a module writer have to deal with both regular
strings and iovecs?  Even if you don't use iovec's, you have to mimic
them.  Take this instance:

first call to filter:   <!--
second call:            #set
third call:             foo=bar -->

The module has just been forced to implement iovec's.  :-)

> Eh? This is the entire design of the BUFF code. Accumulate data into a
> pre-alloc'd buffer. When it gets large enough, dump it to the network. If
> the input buffer is large enough, then drop it straight to the network.
> I disagree with your characterization about the two schemes. I look at your
> mod_gargle and see a bunch of small buffers created, and each is dumped to
> ap_bwrite() in the filter_io() function.

This is because it hasn't been optimized.  Do not take non-optimized code
for the final thing.  It is VERY easy to re-write bwrite (or create
ap_bvwrite) to sum up the amount of data passed and use that value as the

> As you make it easier to create ioblocks (e.g. ap_ioblock_add("string")),
> then people will start appending little bits to the ioqueue. Same situation
> as the link-based scheme. Any coalescing that can be done in the hook scheme
> would duplicate what BUFF is already doing. IOW, even if one of the schemes
> "can" coalesce blocks, it is a moot point.

See above.

> Neither filtering scheme is going to encourage any particular buffer size.
> They'll be big or small, many or few, as the module writer desires. I don't
> see this (2) as an actionable "issue" with the link-based scheme.

See above.

> > 3)  Modules must filter data without any context about what else is going
> > on around the data they are directly working on.  Unless of course they
> > maintain that data themselves.  There is an obvious change to the hook
> > scheme that allows us to send a minimum amount of data to a hook at any
> > one time.  This cannot be done for the link based scheme (at least not
> > cleanly IMHO.)
> I am unsure what you are saying here. It seems like you're suggesting that
> we buffer up data and pass larger chunks to the filters. That seems possible
> with either scheme.
> Am I missing something here?

Yes, in the link-based scheme, the module has a small piece of data to
work with, which provides no context for the module.  If you start to
buffer stuff up in the link-based scheme, you are using iovec's to do
it.  This removes your objection to the complexity of the iovec's.  BTW, I
believe you have just described Roy's bucket brigades.

> Common work that still needs to be done:
> - add the "insert_filters" hook
> - some ap_rputs() calls switched to use an internal function (which looks
>   exactly like today's ap_rputs); later, we'll switch ap_rputs() to filter
>   content, which won't affect the non-content writes.
> - automatically call ap_send_http_header() on the first output, or when the
>   request terminates with an empty-body response such as 204
> - add/use trailer_hdrs to request_rec for mod_auth_digest
> - create a flush_filters() to be called from ap_finalize_request_protocol();
>   will be filled in later.
>   (this is also where the trailer_hdrs are sent out)

I am -1 veto'ing any changes to this code until a decision is made on the
approach.  Breaking patches that are currently being reviewed is a VERY
bad precedence to set.  The patch that has already been committed makes it
impossible for anybody new to review the patch.  Adding more to that is a
very bad idea.


Ryan Bloom               
406 29th St.
San Francisco, CA 94131

View raw message