Roy T. Fielding wrote:

> A better optimization might be to reduce the number of calls to
> brigade_puts. That's how much of 1.3 was improved.

I only know of three ways to reduce the number of apr_brigade_puts()
calls in 2.0:

  * Send fewer fields in the HTTP response header.

  * Or do more buffering prior to calling apr_brigade_puts().
    (This is what 2.0 used to do, and it was even slower, because
    it added yet another layer of memory copying before the socket
    write.)

  * Or produce a separate bucket for each field in the response
    header, and rely on writev to patch them together. (This won't
    work in 2.0; if the number of tiny buckets grows too large,
    core_output_filter() will try to consolidate them into a single
    bucket, with the associated memcpy cost.)

Were you thinking of a different approach from these?

--Brian