httpd-modules-dev mailing list archives

From Sorin Manolache <sor...@gmail.com>
Subject Re: Apache Buckets and Brigade
Date Wed, 01 May 2013 13:41:33 GMT
On 2013-05-01 14:54, Sindhi Sindhi wrote:
> Hello,
>
> Thanks a lot for providing answers to my earlier emails with subject
> "Apache C++ equivalent of javax.servlet.Filter". I really appreciate your
> help.
>
> I had another question. My requirement is something like this -
>
> I have a huge html file that I have copied into the Apache htdocs folder.
> In my C++ Apache module, I want to get this html file contents and
> remove/replace some strings.
>
> Say I have a HTML file that has the string "oldString" appearing 3 times in
> the file. My requirement is to replace "oldString" with the new string
> "newString". I have already written a C++ function that has a signature
> like this -
>
> char* processHTML(char* inHTMLString) {
> //
> char* newHTMLWithNewString = <code to replace all occurrences of
> "oldString" with "newString">
> return newHTMLWithNewString;
> }
>
> The above function does a lot more than just string replacement; it has a
> lot of business logic implemented and finally returns the new HTML string.
>
> I want to call processHTML() inside my C++ Apache module. As far as I know,
> Apache maintains internal data structures called buckets and brigades, which
> actually contain the HTML file data. My question is: does the entire HTML
> file content (in my case the HTML file is huge) reside in a single
> bucket? That is, when I fetch one bucket at a time from a brigade, can I be
> sure that the entire HTML file data from <html> to </html> can be found in
> a single bucket? For example, if my HTML file looks like this -
> <html>
> ..
> ..
> oldString
> ... oldString...........oldString..
> ..
> </html>
>
> When I iterate through all buckets of a brigade, will I find my entire HTML
> file content in a single bucket OR the HTML file content can be present in
> multiple buckets, say like this -
>
> case1:
> bucket-1 contents =
> "<html>
> ..
> ..
> oldString
> ... oldString...........oldString..
> ..
> </html>"
>
> case2:
> bucket-1 contents =
> "<html>
> ..
> ..
> oldStr"
>
> bucket-2 contents =
> "ing
> ... oldString...........oldString..
> ..
> </html>"
>
> If it's case2, then the function processHTML() I have written will not
> work because it searches for the entire string "oldString" and in case2
> "oldString" is found only partially.


Unfortunately there is no guarantee that the whole file is in one brigade.

Even if it were in one brigade, there is no guarantee that it is in a
single bucket.

So you can have case2.

In my experience, the buckets that I've seen hold about 8 kilobytes each.
So you will not consume too much memory if you "flatten" the bucket brigade
into one buffer and then perform the replacement in the buffer (see
apr_brigade_flatten). However, you have to provide for the case in which
oldString is split between two brigades.
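
One possible way to handle the split case is sketched below. It is only
an illustration and deliberately uses plain C++ with no APR calls: each
call to feed() stands for "the flattened contents of one brigade". The
class name StreamReplacer and its methods are hypothetical, made up for
this example. The idea is to hold back the last strlen(oldString) - 1
bytes of every chunk and prepend them to the next one, so that an
occurrence cut in half at the boundary is seen whole the next time.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not part of APR or httpd): replaces every
// occurrence of `from` with `to` in data that arrives in pieces.
class StreamReplacer {
public:
    StreamReplacer(std::string from, std::string to)
        : from_(std::move(from)), to_(std::move(to)) {}

    // Feed one chunk, e.g. one flattened brigade. Returns the part of
    // the output that is safe to pass on right now.
    std::string feed(const std::string& chunk) {
        if (from_.empty()) return chunk;      // nothing to search for

        std::string data = carry_ + chunk;    // re-attach the held-back tail
        carry_.clear();

        std::string out;
        std::size_t pos = 0;
        for (;;) {
            std::size_t hit = data.find(from_, pos);
            if (hit == std::string::npos) break;
            out.append(data, pos, hit - pos); // text before the match
            out += to_;                       // the replacement
            pos = hit + from_.size();
        }

        // The last from_.size() - 1 bytes might be the start of a match
        // that continues in the next chunk, so keep them for later.
        std::size_t rest = data.size() - pos;
        std::size_t keep = std::min(rest, from_.size() - 1);
        out.append(data, pos, rest - keep);
        carry_.assign(data, data.size() - keep, keep);
        return out;
    }

    // Call once at end of stream (in a filter: when the EOS bucket is
    // seen) to emit whatever is still held back.
    std::string finish() {
        std::string out;
        out.swap(carry_);
        return out;
    }

private:
    std::string from_;
    std::string to_;
    std::string carry_;  // tail of the previous chunk, not yet emitted
};
```

With case2 from the question, feeding "<html>oldStr" and then the rest of
the file, followed by finish(), gives the same output as running the
replacement on the whole file at once. In a real output filter the
held-back bytes would have to live in the filter context so that they
survive between invocations. Note that this only covers a fixed search
string; if processHTML() needs more context than that, the amount to
hold back has to be chosen accordingly.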

Sorin

>
> Thanks a lot.
>

