httpd-modules-dev mailing list archives

From Issac Goldstand <>
Subject Re: server side includes
Date Tue, 10 Apr 2007 19:26:09 GMT
Good point.  Actually, with Perl, I've found it simpler and usually more
convenient to dispatch buckets to HTML::Parser, which is better at
catching that sort of thing and can also work on a stream, chunk by
chunk.  The stream-based interface provided by mod_perl makes this easy:

    while ($f->read(my $buf)) { $parser->parse($buf); }

You then have callbacks for opening tags, text (including whitespace)
and closing tags, which can all default to $f->print($content), and you
can add any custom business logic on top of that.  I can provide some
more fleshed-out pseudocode if anyone's interested.
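
A rough sketch of the callback arrangement described above, runnable
outside Apache.  The $emit closure is a hypothetical stand-in for the
filter's output call, the chunk list stands in for bucket data, and the
<b> to <strong> rewrite is just an invented example of "custom business
logic"; only the HTML::Parser calls are the real API.

```perl
use strict;
use warnings;
use HTML::Parser ();

my $out  = '';    # collects what a real filter would send downstream
my $emit = sub { $out .= $_[0] if defined $_[0] };    # stand-in for the filter's output call

my $parser = HTML::Parser->new(
    api_version => 3,
    # Anything without a more specific handler is passed through untouched.
    default_h => [ $emit, 'text' ],
    # Custom logic for opening tags: rewrite <b> as <strong> (illustrative only).
    start_h => [ sub {
        my ($tagname, $text) = @_;
        $emit->($tagname eq 'b' ? '<strong>' : $text);
    }, 'tagname, text' ],
    # Matching logic for closing tags.
    end_h => [ sub {
        my ($tagname, $text) = @_;
        $emit->($tagname eq 'b' ? '</strong>' : $text);
    }, 'tagname, text' ],
);

# Chunks as they might arrive from successive buckets; the first
# boundary falls in the middle of a tag.
for my $buf ('<p>Hello <', 'b>wor', 'ld</b></p>') {
    $parser->parse($buf);
}
$parser->eof;    # tell the parser the stream has ended so it flushes held data

print "$out\n";
```

In an actual mod_perl filter the handler is invoked more than once per
response, so the parser object would presumably have to be kept in
$f->ctx between invocations, with $parser->eof called once
$f->seen_eos is true.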

The downside is that it locks you into Perl, which I intentionally
wanted to avoid in this particular module.  And you still need to make
sure you don't get stuck with huge amounts of leftover data between
buckets (as a small bonus, HTML::Parser takes care of remembering the
leftover data for you).
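
To illustrate the leftover handling, here is a small sketch that feeds
the multi-line <img> tag from the quoted message below to HTML::Parser
in deliberately awkward 7-byte pieces.  The chunk size and the @seen
array are arbitrary choices for the demonstration, not anything from
the module under discussion.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @seen;    # one [tagname, attribute hash] pair per start tag reported

my $parser = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub {
        my ($tagname, $attr) = @_;
        push @seen, [ $tagname, $attr ];
    }, 'tagname, attr' ],
);

# The tag from the quoted message, with a '>' inside an attribute value.
my $html = qq{<img\nsrc = "arrow.gif"\nalt = " --> "\n>};

# Feed it 7 bytes at a time, so every chunk ends somewhere inside the tag.
# The parser holds incomplete markup internally until the rest arrives.
while (length $html) {
    $parser->parse(substr($html, 0, 7, ''));
}
$parser->eof;

for my $tag (@seen) {
    my ($tagname, $attr) = @$tag;
    print "$tagname: ", join(', ', map { "$_=[$attr->{$_}]" } sort keys %$attr), "\n";
}
```

The start handler should fire only once the closing '>' has arrived,
with the quoted " --> " kept intact as the alt value.  Note that the
parser's internal buffer still grows for as long as a tag stays
unterminated, so this does not remove the need to watch for oversized
leftovers.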

I don't see any perfect way to avoid the big common problems, though;
you always have to know what you're aiming the filters at and work


Nick Kew wrote:
> On Tue, 10 Apr 2007 09:20:22 +0300
> Issac Goldstand <> wrote:
>> $buf = ${$f->ctx}{leftover}.$buf if defined(${$f->ctx}{leftover});
>> (prepend f->ctx->leftover onto buf)
>> and anything leftover that doesn't include a full HTML tag goes to
>> ${$f->ctx}{leftover} = $buf || undef;
> Define "a full HTML tag".
> As in, for instance
> 	<img
> 	src = "arrow.gif"
> 	alt = " --> "
> 	>
> The point being, it's not a trivial task (and that's without
> putting things like the above in a comment or cdata section
> where its semantics are completely different, etc).
