httpd-dev mailing list archives

Subject: Re: [PATCH] ap_add_filter
Date: Sun, 20 Aug 2000 13:07:38 GMT

In a message dated 00-08-20 11:05:54 EDT, Ryan writes...

> Because filters can only add more
> filters immediately after themselves, they have an implicit sub_chain, and
> cannot muck with the filter stack in general. 

Yes... this is exactly what I suggested a few days ago ( the DOS
TSR IO chain example ) and it WILL work under MOST circumstances. 
It also achieves the same 'keep the filter in its own little world' goal that
I mentioned earlier and that Manoj is now talking about.

As Roy pointed out a few messages ago as well... you are also
automatically gaining the ability for a filter to add a pre-pass filter
both immediately 'before' and 'after' itself. Anything that can add
a hook after itself can always add something 'in front' of itself as
well by simply making the ( for lack of a better word ) 'main' code
thread consider itself 'second' in the chain instead of 'first'.

This is still exactly how DOS TSR I/O filtering works. Any TSR
can install a pass through any filtering code it has access to 
either 'before' or 'after' itself but the process for doing both is
the same. The interrupt is going to fire and the code pointed
to by the IVEC table is going to execute... but what happens
after that ( and in what order ) is completely up to the code sitting
at the jump address for that interrupt in the vector table.

Look at it this way...

Your standard HOOK code which knows that filters have been
registered with a ( single ) entry point at startup is the 'interrupt'
that's always going to fire. Apache WILL tap the filter on the
shoulder with a call to its entry point and this is, essentially, the
'interrupt' that fires for each and every request.

The filter's ( currently single ) entry point gets the 'tap on the
shoulder' and the chance to do anything it wants including
passing the data through its own self-established filtering
path. DOS and interrupt handling code never knew what was
happening and your base level hook/filtering call code never
needs to know, either.
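
A minimal sketch of that idea, assuming nothing about the real
Apache hook API: every name below ( private_chain, chain_add_before,
chain_add_after, tsr_style_entry ) is hypothetical and only there for
illustration. The caller taps one entry point; the order of the
passes behind it is the filter's own business.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; none of this is Apache API. */
#define MAX_STAGES 8
#define TRACE_LEN  32   /* callers must pass a buffer of this size */

typedef void (*stage_fn)(char *trace);

struct private_chain {
    stage_fn stages[MAX_STAGES];
    size_t   count;
};

/* Each toy stage records that it ran by appending one tag character. */
void append_tag(char *trace, char tag)
{
    size_t n = strlen(trace);
    assert(n + 1 < TRACE_LEN);
    trace[n] = tag;
    trace[n + 1] = '\0';
}

void pre_pass(char *trace)  { append_tag(trace, 'P'); }
void main_pass(char *trace) { append_tag(trace, 'M'); }
void post_pass(char *trace) { append_tag(trace, 'Q'); }

/* Add a stage 'after' everything already in the private chain. */
void chain_add_after(struct private_chain *c, stage_fn fn)
{
    assert(c->count < MAX_STAGES);
    c->stages[c->count++] = fn;
}

/* Add a stage 'in front': the existing stages simply become 'second'. */
void chain_add_before(struct private_chain *c, stage_fn fn)
{
    assert(c->count < MAX_STAGES);
    memmove(&c->stages[1], &c->stages[0], c->count * sizeof c->stages[0]);
    c->stages[0] = fn;
    c->count++;
}

/* The single entry point the caller knows about. The caller never
 * sees how many stages sit behind it or in what order they run. */
void tsr_style_entry(const struct private_chain *c, char *trace)
{
    for (size_t i = 0; i < c->count; i++)
        c->stages[i](trace);
}
```

Registering main_pass, then post_pass after it, then pre_pass in front
of it leaves the trace reading "PMQ" after one call to the entry point.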

It's simple, understandable, has a 30 year proven track record,
and in the right hands can provide a great deal of creative

However ( there's always a catch, isn't there! )...

Even if you are having trouble imagining scenarios where a piece
of filtering code might have to add some sort of 'pre-pass' that
sits 'in front' of some other mainline piece of filtering code in
the same module... fully supporting the current design direction
is going to come down, very quickly, to the issue of re-entrance.

What you are actually doing here is adding both full MPM support
and IO filtering into the same major revision of the software... but
you are trying to not require any major changes to modules that
are already written. No small task.

Here is a CONCRETE example of some VERY useful code that
could become a kick-ass Apache filter... but only if the things
I've mentioned are air-tight ( Ability to pre-pend and safe re-entrance ).

I already have a compilable interface for Apache that converts
graphics on the fly. It can convert ( and scale ) any Web graphic
into a size and a format that is compatible with any graphics-limited
user agent such as a PDA ( Palm Pilot ) or a WebPhone.

It was written more than a year ago and is, of course, not using
the new 'filtering' scheme but once added to Apache it does
exactly what you imagine a 'classic' filter doing.

1. It gets 'tapped on the shoulder' by the request processing
engine and, if the outbound GET object is a graphics file, it
'installs a copy of the filter' and is going to get the chance
to filter the outbound data.

2. It looks at the request headers ( user-agent, mime-type, etc )
during the 'tap on the shoulder' and also makes decisions based
on those values about whether it should even 'kick in' and
what to do once it does.

3. Depending on what is about to happen for the current
transaction... it might need to install a pre-pass filter on
the graphics data that sits in front of the mainline code
which is going to scale and reformat the graphics data.

Why would a pre-pass ever be needed?

Simple... it uses the standard NetPBM graphics stuff
and that's exactly how it is actually written. NetPBM
does a 'pre-pass' on all input data which always converts
all graphics data to the .PBM ( Portable Bitmap ) format
before it does anything else.

Actually... If NetPBM is being used as a real-time graphics
processing engine then you end up with any number of
full 'passes' through the code depending on the output
you want. Sometimes it has to do the PBM conversion
first, then re-enter the code and do the scaling, then
re-enter the code and do the palette conversions, then
re-enter one more time and come up with the final 
output graphics format.

In other words... each NetPBM 'filtering' pass has to
fully complete before the next one can begin.

So, obviously, from a design standpoint... the best thing
to do here is simply consider all the passes that might
have to be performed on the data to be different 'filters'....
even though they will all have the same entry point.

Which of the 'pre-pass' or 'post-pass' filters have to 
be called is always dependent on what needs to happen
with the current object and the filter code itself is the
only thing that's going to know what needs to happen when
to produce the right final 'filtered' output.
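
Roughly, and only as a sketch: the code below does not call NetPBM
and does not use any Apache API. The struct fields, the pass names
and the 'plan' logic are all assumptions made up for illustration.
It shows one entry point being re-entered once per pass, with each
pass finishing before the next begins, and the filter itself
deciding which passes the current object needs.

```c
#include <stddef.h>

/* Hypothetical illustration only; no real NetPBM or Apache calls. */
enum pass { PASS_TO_PBM, PASS_SCALE, PASS_PALETTE, PASS_ENCODE, PASS_COUNT };

struct image {          /* toy stand-in for the graphics data */
    int is_pbm;         /* already in the intermediate format? */
    int width;
    int colors;
    int encoded;        /* final output format produced? */
};

struct target {         /* what the user agent can handle */
    int max_width;
    int max_colors;
};

/* Only the filter knows which passes this object needs, and in
 * what order. Returns the number of passes written to plan[]. */
size_t plan_passes(const struct image *img, const struct target *t,
                   enum pass plan[PASS_COUNT])
{
    size_t n = 0;
    if (!img->is_pbm)                plan[n++] = PASS_TO_PBM;
    if (img->width > t->max_width)   plan[n++] = PASS_SCALE;
    if (img->colors > t->max_colors) plan[n++] = PASS_PALETTE;
    plan[n++] = PASS_ENCODE;         /* always produce final output */
    return n;
}

/* One entry point for every pass; 'which' selects the work. */
void image_filter(enum pass which, struct image *img, const struct target *t)
{
    switch (which) {
    case PASS_TO_PBM:  img->is_pbm  = 1;             break;
    case PASS_SCALE:   img->width   = t->max_width;  break;
    case PASS_PALETTE: img->colors  = t->max_colors; break;
    case PASS_ENCODE:  img->encoded = 1;             break;
    case PASS_COUNT:   break;        /* not a real pass */
    }
}

/* Re-enter the same code once per planned pass; each call returns
 * (the pass is complete) before the next one starts. */
size_t run_passes(struct image *img, const struct target *t)
{
    enum pass plan[PASS_COUNT];
    size_t n = plan_passes(img, t, plan);
    for (size_t i = 0; i < n; i++)
        image_filter(plan[i], img, t);
    return n;
}
```

A 640 wide, 256 color image that is not yet in the intermediate
format, headed for a 160 wide, 4 color device, takes all four passes.
An image that already fits takes only the final encode pass.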

Actually... ANY filtering scheme from any corner of the
earth that depends on full multiple passes through the
data ( think WORKFILES ) is going to need this safe
'clone thyself' ability to avoid total re-writes.


No matter what scheme is finally employed for allowing
a filter to add other filters either 'before' or 'after' itself...
if a filter doesn't at least have the ability to 'clone' itself
and establish its OWN 'filtering chain' which is both
re-entrant and thread-safe then the design is going to get 
real inflexible real fast.

Have you thought about adding 'instance' handles?

This is standard stuff in these situations and
this is what any filter could use to simply clone itself
but KNOW which 'instance' of itself is now being 'called'
so it can keep its OWN filtering path straight.

Instance handles are no big deal but they do need to be
handled by the underlying code since only it knows which
'instance' of a cloned filter is being called when the entry
point is tapped on the shoulder.
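
Something along these lines, as a sketch only. The names
( filter_instance, instance_table, instance_clone, filter_entry )
are hypothetical and are not part of any existing Apache interface.
The assumption is that the underlying code owns one table per
request and hands the right handle back on every call, so the
filter itself needs no globals.

```c
#include <stddef.h>

/* Hypothetical illustration only; not an existing Apache interface. */
#define MAX_INSTANCES 16

struct filter_instance {
    int    id;          /* which clone this is */
    int    pass;        /* what this clone was created to do */
    size_t bytes_seen;  /* per-instance state, never shared */
};

/* Owned by the underlying code, one table per request, so the same
 * scheme holds whether the MPM is process or thread based. */
struct instance_table {
    struct filter_instance slots[MAX_INSTANCES];
    int used;
};

/* 'Clone thyself': hand out a fresh handle for the same entry point.
 * Returns NULL when the table is full. */
struct filter_instance *instance_clone(struct instance_table *t, int pass)
{
    if (t->used == MAX_INSTANCES)
        return NULL;
    struct filter_instance *inst = &t->slots[t->used];
    inst->id = t->used++;
    inst->pass = pass;
    inst->bytes_seen = 0;
    return inst;
}

/* The single entry point. The handle tells the code which instance
 * of itself is being called, so each clone keeps its own path straight. */
void filter_entry(struct filter_instance *self, const char *data, size_t len)
{
    (void)data;                 /* a real filter would transform this */
    self->bytes_seen += len;
}
```

Two clones made from the same table get ids 0 and 1, and bytes fed
to one never show up in the other's bytes_seen.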

Obviously the elimination of global variables in modules
that are filters and the complete knowledge of which 
MPM model is in use ( process or thread ) all comes into
play when you ask code to start cloning itself for different
tasks but that is really not Apache's concern... the onus
there is on the module/filter writer where it should be.

Kevin Kiley
CTO, Remote Communications, Inc.
http://www.rctp - Online Internet Content Compression Server
