httpd-modules-dev mailing list archives

From Joshua Marantz <jmara...@google.com>
Subject Re: Apache Buckets and Brigade
Date Wed, 01 May 2013 15:34:12 GMT
I didn't know about mod_substitute or mod_sed  :)  The
ModPagespeedSubstitute command I proposed probably adds nothing to those.

But in any case that was not sufficient for Sindhi's use-case where he
needs to impose data-dependent business logic and not statically define a
substitution in a conf file.

-Josh


On Wed, May 1, 2013 at 11:19 AM, Jim Jagielski <jim@jagunet.com> wrote:

> How is that different from mod_substitute and/or mod_sed?
>
> On May 1, 2013, at 9:22 AM, Joshua Marantz <jmarantz@google.com> wrote:
>
> > I have a crazy idea for you.  Maybe this is overkill but this sounds like
> > it'd be natural to add to mod_pagespeed <http://modpagespeed.com> as a
> > new filter.
> >
> > Here's some code you might use as a template
> >
> >
> https://code.google.com/p/modpagespeed/source/browse/trunk/src/net/instaweb/rewriter/collapse_whitespace_filter.cc
> >
> > One thing we've thought of doing is providing a generic text-substitution
> > filter that would take strings in character-blocks and do arbitrary
> > substitutions in them, which could be specified in the .conf file:
> >  ModPagespeedSubstitute "oldString" "newString"
> >
> > You are right that text-blocks in Apache output filters can be split
> > arbitrarily across buckets, but mod_pagespeed takes care of that in an
> > HTML-centric way, breaking up blocks on HTML tokens. A block of
> > free-format text would be treated as a single atomic token independent
> > of the structure of the incoming bucket brigade.
> >
> > Let me know if you'd like to discuss this further.
> >
> > -Josh
> >
> >
> > On Wed, May 1, 2013 at 8:54 AM, Sindhi Sindhi <sindhi.for@gmail.com> wrote:
> >
> >> Hello,
> >>
> >> Thanks a lot for providing answers to my earlier emails with subject
> >> "Apache C++ equivalent of javax.servlet.Filter". I really appreciate
> >> your help.
> >>
> >> I had another question. My requirement is something like this -
> >>
> >> I have a huge HTML file that I have copied into the Apache htdocs
> >> folder. In my C++ Apache module, I want to get this HTML file's contents
> >> and remove/replace some strings.
> >>
> >> Say I have an HTML file that has the string "oldString" appearing 3
> >> times in the file. My requirement is to replace "oldString" with the new
> >> string "newString". I have already written a C++ function that has a
> >> signature like this -
> >>
> >> char* processHTML(char* inHTMLString) {
> >> //
> >> char* newHTMLWithNewString = <code to replace all occurrences of
> >> "oldString" with "newString">
> >> return newHTMLWithNewString;
> >> }
> >>
> >> The above function does a lot more than just string replacement; it has
> >> a lot of business logic implemented and finally returns the new HTML
> >> string.
> >>
> >> I want to call processHTML() inside my C++ Apache module. As far as I
> >> know, Apache maintains internal data structures called buckets and
> >> brigades, which actually contain the HTML file data. My question is: does
> >> the entire HTML file content (in my case the HTML file is huge) reside in
> >> a single bucket? That is, when I fetch one bucket at a time from a
> >> brigade, can I be sure that the entire HTML file data from <html> to
> >> </html> can be found in a single bucket? For example, if my HTML file
> >> looks like this -
> >> <html>
> >> ..
> >> ..
> >> oldString
> >> ... oldString...........oldString..
> >> ..
> >> </html>
> >>
> >> When I iterate through all buckets of a brigade, will I find my entire
> >> HTML file content in a single bucket, or can the HTML file content be
> >> present in multiple buckets, say like this -
> >>
> >> case1:
> >> bucket-1 contents =
> >> "<html>
> >> ..
> >> ..
> >> oldString
> >> ... oldString...........oldString..
> >> ..
> >> </html>"
> >>
> >> case2:
> >> bucket-1 contents =
> >> "<html>
> >> ..
> >> ..
> >> oldStr"
> >>
> >> bucket-2 contents =
> >> "ing
> >> ... oldString...........oldString..
> >> ..
> >> </html>"
> >>
> >> If it's case2, then the function processHTML() I have written will not
> >> work, because it searches for the entire string "oldString" and in case2
> >> "oldString" is found only partially.
> >>
> >> Thanks a lot.
> >>
>
>
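
The case2 situation described in the thread, where "oldString" straddles two buckets, can be handled without holding the whole file in memory by carrying a short tail from one chunk into the next. What follows is a minimal C++ sketch under stated assumptions, not code from Apache or mod_pagespeed: ChunkedReplacer, Feed() and Finish() are hypothetical names, and each std::string chunk stands in for the bytes read from one bucket. It covers only fixed-string replacement, not the wider business logic mentioned for processHTML().

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not an Apache or mod_pagespeed API): replaces every
// occurrence of `from` with `to` in a stream that arrives in arbitrary chunks.
class ChunkedReplacer {
 public:
  ChunkedReplacer(std::string from, std::string to)
      : from_(std::move(from)), to_(std::move(to)) {}

  // Feed one chunk (assumed to be the data of one bucket). Returns the
  // output that is safe to pass downstream right away.
  std::string Feed(const std::string& chunk) {
    if (from_.empty()) return chunk;  // nothing to search for
    pending_ += chunk;

    std::string out;
    std::size_t pos = 0;
    for (;;) {
      const std::size_t hit = pending_.find(from_, pos);
      if (hit == std::string::npos) break;
      out.append(pending_, pos, hit - pos);  // text before the match
      out += to_;                            // the replacement
      pos = hit + from_.size();
    }

    // Hold back the last from_.size() - 1 bytes: they could be the start
    // of a match that the next chunk completes.
    const std::size_t keep = from_.size() - 1;
    const std::size_t remaining = pending_.size() - pos;
    if (remaining > keep) {
      out.append(pending_, pos, remaining - keep);
      pos += remaining - keep;
    }
    pending_.erase(0, pos);
    return out;
  }

  // Call once at end of stream to flush the held-back tail.
  std::string Finish() {
    std::string out;
    out.swap(pending_);
    return out;
  }

 private:
  std::string from_;
  std::string to_;
  std::string pending_;  // unprocessed tail carried between chunks
};
```

In a real output filter, Feed() would be called with the data of each bucket in turn and Finish() once the end-of-stream bucket is seen. Logic that needs the complete document at once would instead have to buffer every chunk and run on the accumulated string at that point.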
