httpd-dev mailing list archives

From dorian taylor <dorian.taylor.li...@gmail.com>
Subject Wanted: Orientation on interaction between filters and subrequests
Date Mon, 22 Jul 2013 06:12:25 GMT
Hello,

Apologies in advance for the potential repeat; I initially posted to
modules-dev but realized this is more of an internals problem.

I'm trying to write a module that does what one would intuitively
expect from the following mod_rewrite incantation:

    RewriteCond %{REQUEST_URI} !-U
    RewriteRule (.*) http://another.host$1 [P,NS]

In other words, "If the resource (powered by an arbitrary back-end)
can't be found on this server, reverse-proxy the request to
another.host." Due to its design, mod_rewrite can't do this.
What I want this functionality for is a sort of "scaffolding" through
which a website can be replaced incrementally, resource by resource,
without having to know anything about the middleware(s) of either the
old or the new server (see
http://doriantaylor.com/the-redesign-dissolved for context).

I'm hoping somebody can spell out for me, or point me in the direction
of resources on how to better understand the design of the bucket
brigade I/O and how it interacts with subrequests.

Here is my strategy so far:

1) Start with a fixup handler rigged to run only on main requests
2) The fixup handler performs a subrequest on the same URI as the main request
3) If the lookup response is 404, set the response handler to
proxy-handler and r->filename to "proxy:..." and return OK
4) Otherwise, attach an output filter to the subrequest that (somehow)
blocks writes to the network, and run the subrequest
5) Repeat the check from step 3, this time against the status produced
by running the subrequest.
6) If we're still here, set the response handler to a dummy handler
with its own output filter that unblocks the output and return OK.
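
A minimal sketch of the decision flow in steps 1 to 6, written as a
plain-C model rather than as an httpd module. Every name here
(local_resource, run_subrequest, fixup_decide) is hypothetical and
stands in for the real lookup and run of a subrequest; no httpd API is
used. It only illustrates that the 404 check happens twice, once after
the lookup and once after the handler has actually run:

```c
#include <assert.h>
#include <stddef.h>

#define STATUS_OK        200
#define STATUS_NOT_FOUND 404

typedef enum { SERVE_LOCAL, PROXY_TO_REMOTE } decision;

/* Hypothetical stand-in for a local resource and its subrequest. */
typedef struct {
    int lookup_status; /* status known after the lookup (step 2)      */
    int run_status;    /* status set by the response handler (step 4) */
    int times_run;     /* how often the response handler has executed */
} local_resource;

/* Hypothetical: running the subrequest executes the handler once. */
static int run_subrequest(local_resource *res)
{
    res->times_run++;
    return res->run_status;
}

/* Models the fixup logic for a main request. */
decision fixup_decide(local_resource *res)
{
    assert(res != NULL);

    /* Step 3: the lookup alone already says "not found". */
    if (res->lookup_status == STATUS_NOT_FOUND)
        return PROXY_TO_REMOTE;

    /* Step 4: run the subrequest (its output would be held back). */
    int status = run_subrequest(res);

    /* Step 5: the handler itself may have produced the 404. */
    if (status == STATUS_NOT_FOUND)
        return PROXY_TO_REMOTE;

    /* Step 6: serve the held local response. */
    return SERVE_LOCAL;
}
```

In this model the handler runs at most once per request, which is the
property the steps above are after.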

(Note, for simplicity's sake I am leaving out dealing with request
bodies for the moment, but the plan is tentatively to 'tee' them to a
temporary file and then replay them into the proxy request if
applicable.)

(I should also note that unless there is a particularly compelling
alternative I have overlooked, I have good reason to be confident that
this conditional reverse proxy should manifest as an Apache module.)

Where I'm fuzzy is the "somehow" of "pausing" the output to the
network from the subrequest and resuming it in the response handler in
the main request. I have a prototype which currently runs the local
request twice: first in the subrequest, which I discard, and again
(unless it is diverted to the reverse proxy) in the main request,
which I leave untouched (recall, I am performing this operation in the
fixup phase). Inefficiency aside, this solution is inadequate because
running a dynamic resource twice can corrupt application state. And of
course, the only reason I have to run the subrequest at all is that
the 404 status appears, more often than not, to be set by a module's
response handler.
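
To make the "pause and resume" idea concrete, here is a minimal sketch
of a holding buffer in plain C. It is a model only and uses no httpd
API: held_output stands in for a brigade of set-aside buckets, and
sink_fn for whatever sits downstream (the next filter or the network).
All names are hypothetical. In httpd itself the rough analogue would be
an output filter that sets buckets aside (ap_save_brigade) instead of
passing them on (ap_pass_brigade), but that mapping is an assumption,
not tested code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holding buffer for output that must not be sent yet. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} held_output;

/* Hypothetical downstream writer (next filter, network, ...). */
typedef void (*sink_fn)(void *ctx, const char *data, size_t len);

void held_init(held_output *h)
{
    h->data = NULL;
    h->len = 0;
    h->cap = 0;
}

/* "Paused" write: append to the buffer instead of sending downstream.
 * Returns 0 on success, -1 if memory could not be allocated. */
int held_write(held_output *h, const char *data, size_t len)
{
    assert(h != NULL);
    if (len == 0)
        return 0;
    if (h->len + len > h->cap) {
        size_t cap = h->cap ? h->cap : 64;
        while (cap < h->len + len)
            cap *= 2;
        char *p = realloc(h->data, cap);
        if (p == NULL)
            return -1;
        h->data = p;
        h->cap = cap;
    }
    memcpy(h->data + h->len, data, len);
    h->len += len;
    return 0;
}

/* "Resume": hand everything held so far to the sink, then empty the
 * buffer (the allocation is kept for reuse). */
void held_release(held_output *h, sink_fn sink, void *ctx)
{
    if (h->len > 0)
        sink(ctx, h->data, h->len);
    h->len = 0;
}

/* Discard path: the status turned out to be 404, so the held output is
 * dropped and the request goes to the proxy instead. */
void held_discard(held_output *h)
{
    free(h->data);
    held_init(h);
}

/* Small collecting sink, for demonstration only. */
typedef struct {
    char   buf[256];
    size_t len;
} string_sink;

void string_sink_write(void *ctx, const char *data, size_t len)
{
    string_sink *s = ctx;
    size_t room = sizeof s->buf - s->len;
    if (len > room)
        len = room; /* truncate rather than overflow */
    memcpy(s->buf + s->len, data, len);
    s->len += len;
}
```

The subrequest's filter would call held_write; the main request's
handler would call either held_release or held_discard once the status
is known. Holding everything in memory trades the temporary-file I/O
for memory proportional to the response size.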

(I am also not clear about what happens under the hood with respect to
the subrequest's header set/protocol data, and whether or not I have
to manually pull it up into the main request.)

I considered putting an EOS bucket at the front of the subrequest's
output brigade, and then popping it off in an output filter in the
main request, but I'm not sure how nasty a hack that would be and/or
what side effects might arise from a trick like that. Alternatively, I
suppose I could write the subrequest's content to another temporary
file and then just replay it in the response handler, but I'd prefer
to avoid creating any unnecessary I/O.

Thanks in advance for any orientation, existing code that behaves
similarly, or any other advice anybody can share.

Regards,

--
Dorian Taylor
http://doriantaylor.com/
