httpd-dev mailing list archives

From Geoffrey Young <ge...@modperlcookbook.org>
Subject Re: The Byterange filter -- a new design -- feel free to rip it to shreds
Date Tue, 13 Jul 2004 15:46:27 GMT


Graham Leggett wrote:
> Geoffrey Young wrote:
> 
>> please take the rest of this as just a friendly discussion - I don't want
>> it to turn into some kind of bickering match, since that's definitely not
>> what I have in mind :)
> 
> 
> Cool no problem - it's quite a complex thing this, and I was struggling
> trying to make it clear what exactly needed to be done and where (and why).

:)


> In this case, we're not just feeding content up the stack, but content
> _and_ HTTP headers. Filters cannot ignore the headers, otherwise broken
> behaviour is the result. 

hmm, yeah I see it now.

> A classic example is a filter that changes the
> length of the content (mod_gzip, or mod_include). These filters need to
> concern themselves with the HTTP Content-Length header, otherwise a
> response from mod_proxy going up the stack could get shipped to the
> browser with the wrong Content-Length.

ok.  but the difference I think I see here between C-L and Range is that if
you are a content-length altering filter, you know it - removing C-L is
required because you are the one doing the altering.

if I understand things right, Range is slightly different.  in this case it
seems like every filter would need to ask "are we byteserving" in order to
know what to do about Range.
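
To make the contrast concrete, here is a toy sketch in Python rather than the httpd C API. Plain dicts stand in for the header tables, and every function name is hypothetical. What a filter should actually do once it detects a partial response is not settled in this thread; passing the body through untouched is only an assumption made for illustration.

```python
# toy model only: dicts stand in for request/response header tables,
# and every name here is hypothetical, not part of the httpd filter API.

def length_altering_filter(request_headers, response_headers, body):
    """A filter that rewrites the body, so it knows C-L is now wrong."""
    new_body = body.replace(b"<!--#echo-->", b"hello")
    # this filter did the altering, so it drops the stale Content-Length
    response_headers.pop("Content-Length", None)
    return new_body


def range_aware_filter(request_headers, response_headers, body, status):
    """The same filter, now also asking "are we byteserving?"."""
    if status == 206 or "Content-Range" in response_headers:
        # assumption: a partial body from further up the stack is
        # passed through untouched rather than altered
        return body
    return length_altering_filter(request_headers, response_headers, body)
```

The first function only has to know about its own behaviour. The second has to inspect state set by some other part of the stack, which is the extra burden being questioned here.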

> 
> In most cases for filters handling the headers is trivial. mod_gzip
> might strip off a Content-Length header in the hope that a filter might
> chunk the response down the line. mod_include should (in the most simple
> case) strip off any Range headers in the request in the hope that the
> byte range filter handles the range request down the line.

but all of a sudden it's not just mod_include, but every similar output
filter, right?  as in all those that API users will be writing.

> 
> But in the case of mod_proxy, mod_jk, etc it is quite valid and very
> desirable for a range request to be passed all the way to the backend,
> in the hope that the backend sends just that range back to mod_proxy,
> which in turn sends it up a filter stack that isn't going to fall over
> because it received a 206 Partial Content response.

yes, I can see that.

part of the trouble I find myself in here is that I see some kind of creep
going on.  right now filters need to handle Content-Length under some
circumstances, and you are suggesting Range as well.  both of these remove
part of the filter abstraction and replace that part with (a few) special
cases.  how many special cases will be required in the end under the current
design, and how much complexity does that add for filter writers?

so perhaps we need to be dealing less with Range specifically and more with
a second-generation filter design that addresses some of these outstanding
issues.  for instance, perhaps designate some kind of filter class system,
whereby content-altering filters register themselves differently than
pass-thru-type filters (as I'll call the current proxy issue I guess), etc?
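
One way such a class system might look, again as a toy Python sketch and not a proposal for the real API: `FilterClass`, `register_filter` and `run_chain` are all made-up names. The assumption is that a filter declares its class at registration time and the framework, not the filter, does the header bookkeeping.

```python
from enum import Enum


class FilterClass(Enum):
    CONTENT_ALTERING = "content-altering"
    PASS_THROUGH = "pass-through"


REGISTRY = []  # (name, filter class, callable), in registration order


def register_filter(name, filter_class, func):
    """Record a filter together with its declared class."""
    REGISTRY.append((name, filter_class, func))


def run_chain(response_headers, body):
    """Run every registered filter; the framework fixes up headers
    based on the declared class, so filter code never touches them."""
    for name, filter_class, func in REGISTRY:
        body = func(body)
        if filter_class is FilterClass.CONTENT_ALTERING:
            # the body may have changed size, so the old length is stale
            response_headers.pop("Content-Length", None)
    return body


register_filter("upper", FilterClass.CONTENT_ALTERING, lambda b: b.upper())
register_filter("log", FilterClass.PASS_THROUGH, lambda b: b)
```

With this shape, the filter writer states one fact ("I alter content") and the special cases for Content-Length, and potentially Range, live in one place.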

> 
>> that's true if I'm wrong about the assumption above.  but in my mind, the
>> filter API is the most useful if content handlers (and content-altering
>> filters) can remain ignorant of 206 responses and the byterange filter
>> can bat cleanup.
> 
> 
> For simplicity's sake the above is a noble goal - but one with some
> significant performance drawbacks in many real world applications.

indeed.  the trouble is that by streamlining for the performance of some
applications you (possibly) limit (or add sufficient complexity to) the
ability of other applications to do other, equally important things.

at least if the API isn't sufficiently worked out from all angles :)

> The problem though is not with the content handlers but with the filters
> - filters must not make the assumption that all content handlers only
> serve content and not HTTP headers.

good point.


> When a content handler decides that
> it wants to handle more of the HTTP spec so as to improve performance,
> it should be free to do so, and should not be stopped from doing so due
> to limitations in the output filters.

yes, that would be a good design goal.

> 
> In other words if mod_proxy is taught how to pass Range requests to the
> backend server, the output filter stack should not stop proxy from doing
> so by removing Range headers unless it is absolutely necessary.

ok, thanks for taking the time to explain it all, if only for me :)

do you think that the proxy-specific issue can be boiled down to something
more generic?  at least from here, it looks like what really needs to happen
is that certain headers need to have their origin and end-point known so
that filters know their place in the process and how to behave when they see
them.  is that kind of the issue, and can a user-level API be created around it?

--Geoff
