httpd-modules-dev mailing list archives

From Ray Morris <supp...@bettercgi.com>
Subject flush or pass filter brigade to avoid memory exhaustion
Date Mon, 14 Nov 2011 19:00:07 GMT
I would appreciate some help with splitting and passing a brigade in 
an output filter, so that the filter does not use memory proportional 
to the size of the response and data can start going out before the 
filter has finished.  I have studied the apache.org docs, the book, 
and other modules, but haven't been able to get this working.  When I 
merge the code from the docs into a sample module, the connection 
is closed after 751,143 bytes.

The apache.org docs here say this is important:

from http://httpd.apache.org/docs/2.3/developer/output-filters.html

   The above implementation would consume memory proportional to content
   size. 

   If passed a FILE bucket, for example, the entire file contents would
   be read into memory as each apr_bucket_read call morphed a FILE
   bucket into a HEAP bucket.

   In contrast, the implementation below will consume a fixed amount
   of memory to filter any brigade; a temporary brigade is needed and
   must be allocated only once per response, see the Maintaining state
   section.


    while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
        rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
        if (rv) ...;
        /* Remove bucket e from bb. */
        APR_BUCKET_REMOVE(e);
        /* Insert it into temporary brigade. */
        APR_BRIGADE_INSERT_HEAD(tmpbb, e);
        /* Pass brigade downstream. */
        rv = ap_pass_brigade(f->next, tmpbb);
        if (rv) ...;
        apr_brigade_cleanup(tmpbb);
    } 
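
To make the memory argument in that passage concrete, here is a toy
model in plain C. It does not use APR: the toy_* names, the linked
list, and the byte counters are all made up for illustration, and
"reading" a bucket simply allocates its length. Under those
assumptions it compares the peak memory held when passing one bucket
at a time against reading everything before passing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for apr_bucket / apr_bucket_brigade. */
typedef struct toy_bucket {
    struct toy_bucket *next;
    char *data;               /* NULL until read, like an unread FILE bucket */
    size_t len;
} toy_bucket;

typedef struct { toy_bucket *head, *tail; } toy_brigade;

/* live: bytes held now; peak: most held at once; sent: passed downstream. */
typedef struct { size_t live, peak, sent; } toy_stats;

/* Append an unread bucket of len bytes to the tail of bb. */
static void toy_append(toy_brigade *bb, size_t len) {
    toy_bucket *b = calloc(1, sizeof *b);
    assert(b != NULL);
    b->len = len;
    if (bb->tail) bb->tail->next = b; else bb->head = b;
    bb->tail = b;
}

/* "Read" a bucket: from now on its contents occupy memory. */
static void toy_read(toy_bucket *b, toy_stats *st) {
    if (b->data) return;
    b->data = malloc(b->len ? b->len : 1);
    assert(b->data != NULL);
    st->live += b->len;
    if (st->live > st->peak) st->peak = st->live;
}

/* Detach and return the first bucket of bb, or NULL if bb is empty. */
static toy_bucket *toy_pop_head(toy_brigade *bb) {
    toy_bucket *b = bb->head;
    if (b) {
        bb->head = b->next;
        if (!bb->head) bb->tail = NULL;
        b->next = NULL;
    }
    return b;
}

/* Downstream consumes every bucket, then the brigade is emptied
 * (the pass step and the cleanup step rolled into one). */
static void toy_pass_and_cleanup(toy_brigade *bb, toy_stats *st) {
    toy_bucket *b;
    while ((b = toy_pop_head(bb)) != NULL) {
        st->sent += b->len;
        if (b->data) st->live -= b->len;
        free(b->data);
        free(b);
    }
}

/* One bucket at a time: read it, move it to tmp, pass tmp, clean up. */
static toy_stats filter_streaming(toy_brigade *bb) {
    toy_stats st = {0, 0, 0};
    toy_brigade tmp = {NULL, NULL};
    toy_bucket *b;
    while ((b = toy_pop_head(bb)) != NULL) {
        toy_read(b, &st);
        tmp.head = tmp.tail = b;
        toy_pass_and_cleanup(&tmp, &st);
    }
    return st;
}

/* Read every bucket first, then pass the whole brigade once. */
static toy_stats filter_buffering(toy_brigade *bb) {
    toy_stats st = {0, 0, 0};
    toy_bucket *b;
    for (b = bb->head; b != NULL; b = b->next)
        toy_read(b, &st);
    toy_pass_and_cleanup(bb, &st);
    return st;
}
```

With four 1000-byte buckets built with toy_append, filter_streaming
reports a peak of 1000 bytes held at once and filter_buffering reports
4000; both send 4000 bytes downstream.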


To learn about this using a simple module, I tried to patch Nick's 
mod_txt.c to include this functionality:

typedef struct {
  ...
  apr_bucket_brigade *tmpbb;
} txt_ctxt ;

static int txt_filter_init(ap_filter_t* f) {
  txt_ctxt* ctxt = f->ctx = apr_palloc(f->r->pool, sizeof(txt_ctxt)) ;
  ...
  ctxt->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
  return OK ;
}


static int txt_filter(ap_filter_t* f, apr_bucket_brigade* bb) {
...
    } else if ( apr_bucket_read(b, &buf, &bytes, APR_BLOCK_READ)
        == APR_SUCCESS ) {

      /* We have a bucket full of text.  Just escape it where necessary
      */
      size_t count = 0 ;
      const char* p = buf ;
      while ( count < bytes ) {
        size_t sz = strcspn(p, "<>&\"") ;
        count += sz ;
        if ( count < bytes ) {
          apr_bucket_split(b, sz) ;
          b = APR_BUCKET_NEXT(b) ;
          APR_BUCKET_INSERT_BEFORE(b, txt_esc(p[sz],
                f->r->connection->bucket_alloc)) ;
          apr_bucket_split(b, 1) ;
          APR_BUCKET_REMOVE(b) ;
          b = APR_BUCKET_NEXT(b) ;
          count += 1 ;
          p += sz + 1 ;
        }
      }

        APR_BUCKET_REMOVE(b);                        // <-- new code
        APR_BRIGADE_INSERT_HEAD(ctxt->tmpbb, b);     // <-- new code
        rv = ap_pass_brigade(f->next, ctxt->tmpbb);  // <-- new code
        apr_brigade_cleanup(ctxt->tmpbb);            // <-- new code
        apr_sleep(10000);                            // <-- new code

    }




testing:
$ wget -v -O /dev/null https://www.bettercgi.com/tmp/words.txt
--2011-11-14 12:02:23--  https://www.bettercgi.com/tmp/words.txt
Resolving www.bettercgi.com... 74.122.122.24
Connecting to www.bettercgi.com|74.122.122.24|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4953699 (4.7M) [text/html]
Saving to: “/dev/null”

15% [================>                    ] 751,143     2.20M/s   in 0.3s

2011-11-14 12:02:32 (2.20 MB/s) - Connection closed at byte 751143.


Without the temporary brigade, there was no error message, but also
no apparent improvement: the content did not begin to download until 
after apr_sleep(10000), which stands in for a long-running operation.

        APR_BUCKET_REMOVE(b);
        APR_BRIGADE_INSERT_TAIL(bb, b);
        ap_pass_brigade(f->next, bb);
        b = APR_BRIGADE_FIRST(bb);
        apr_sleep(10000);


Full source code for the original and the patched:

http://bettercgi.com/tmp/mod_txt.c
https://www.bettercgi.com/tmp/mod_patched.c
-- 
Ray Morris
support@bettercgi.com

Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/

Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/

Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php




