httpd-dev mailing list archives

From TOKI...@aol.com
Subject Re: Killer malloc() in apr_sendv()
Date Fri, 17 Aug 2001 18:34:18 GMT

In a message dated 01-08-17 17:51:20 EDT, Bill Stoddard wrote...

>  > >  I've replaced the malloc/frees in the bucket code on Windows with a
>  > >  power of 2 allocator and it makes a BIG difference in performance.
>  > >  I expect the same on every OS with the exception of Linux.
>  > 
>  > Kevin Kiley wrote...
>  >
>  > You might want to add the same 'power of 2' stuff to a call 
>  > that is absolutely killing the windows version.
>  > 
>  > apr_sendv() in /srclib/apr/network_io/win32/sendrecv.c
>  > is using a malloc()/free() combo for only 8 bytes!
>  > It's using WSASend() and can 'scatter gather' but so far
>  > I've never seen more than 2 separate buffers to 'gather'
>  > on any particular call to apr_sendv().
>  > 
>  > There is a comment there about 'putting it on the stack' but
>  > you'd be better off doing this sooner than later because
>  > the malloc()/free() for only 8 bytes is killing you on every
>  > transmit.
>  > 
>  
>  I'll do it this weekend. 
>  Thanks,
>  
>  Bill

FWIW... I did a quick test here and easily gained more than
60 transactions per second on a 10000-hit crunch test that
just grabs a simple home page, using 2.0.24.

Here is the code I used to move the WSASend() buffer allocations
from malloc()/free() to the stack:

NOTE: The code below still preserves the ability to malloc() if some
absurd number of scatter/gather 'nvec' entries arrives. There are
2 separate execution paths for speed: the 'if' test happens just
once per call, with no extra checks around free(). Even with the way
the filtering can fragment things I really doubt it will ever need
more than 500 iovecs on any one call, but who knows.

I am sure you will do whatever you want anyway so this
apr_sendv() rewrite is obviously just a suggestion...

[snip]

APR_DECLARE(apr_status_t) apr_sendv(apr_socket_t *sock,
                                    const struct iovec *vec,
                                    apr_int32_t nvec, apr_size_t *nbytes)
{
    apr_ssize_t rv;
    int i;
    int lasterror;
    DWORD dwBytes = 0;

    /* 500 WSABUF entries on the stack: about 4k with 32-bit */
    /* pointers (sizeof(WSABUF) is 8), about 8k with 64-bit ones. */

    #define APR_SENDV_MAX_STACK_BUFFERS 500
    WSABUF wsabuf_stack_buffer[ APR_SENDV_MAX_STACK_BUFFERS ];
    LPWSABUF pWsaData_on_the_stack = wsabuf_stack_buffer;

    /* Only used on the malloc() fallback path below. */
    LPWSABUF pWsaData;

    /* Use 2 separate execution paths for speed and avoid */
    /* over-use of 'if' statements around 'free()' calls... */

    /* There's probably no way anyone is ever going to need */
    /* more than 500 WSABUF records so put the most used */
    /* condition FIRST for fastest overall pickup time... */

    if ( nvec <= APR_SENDV_MAX_STACK_BUFFERS )
      {
       /* No need to malloc()... just use the stack... */

       for (i = 0; i < nvec; i++)
          {
           pWsaData_on_the_stack[i].buf = vec[i].iov_base;
           pWsaData_on_the_stack[i].len = vec[i].iov_len;
          }

       rv = WSASend(sock->sock, pWsaData_on_the_stack, nvec,
                    &dwBytes, 0, NULL, NULL);

       if (rv == SOCKET_ERROR) {
         lasterror = apr_get_netos_error();
         return lasterror;
        }
      }
    else /* Ouch! There are more than 500 'scatter gather' buffers! */
      {
       /* Only thing we can do is use malloc() if there are that */
       /* many actual buffers to 'gather'... */

       pWsaData = (LPWSABUF) malloc(sizeof(WSABUF) * nvec);

       if (!pWsaData)
         {
          return APR_ENOMEM;
         }

       for (i = 0; i < nvec; i++)
          {
           pWsaData[i].buf = vec[i].iov_base;
           pWsaData[i].len = vec[i].iov_len;
          }

       rv = WSASend(sock->sock, pWsaData, nvec, &dwBytes, 0, NULL, NULL);

       if (rv == SOCKET_ERROR) {
         lasterror = apr_get_netos_error();
         free(pWsaData);
         return lasterror;
        }

       free(pWsaData);

      }/* End 'else( malloc()/free() was needed )' */

    *nbytes = dwBytes;

    return APR_SUCCESS;
}

[snip]
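
For anyone who wants to try the stack-first / malloc()-fallback idea away from Winsock, here is a minimal portable sketch. It is a variation on the function above, not the same code: it folds the two execution paths into one and pays for that with a single pointer comparison before free(). The names src_vec, dst_buf, fake_send(), gather_send() and the used_heap flag are all hypothetical stand-ins made up for this illustration (for struct iovec, WSABUF and WSASend()), and the stack limit is kept tiny so the fallback path is easy to exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct iovec and WSABUF so the sketch
 * builds without <sys/uio.h> or <winsock2.h>. */
typedef struct { void *base; size_t len; } src_vec;
typedef struct { unsigned long len; char *buf; } dst_buf;

#define MAX_STACK_BUFFERS 16   /* small on purpose; the post uses 500 */

/* Hypothetical stand-in for WSASend(): just adds up the bytes it is given. */
static int fake_send(const dst_buf *bufs, int n, size_t *sent)
{
    size_t total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += bufs[i].len;
    *sent = total;
    return 0;
}

/* Returns 0 on success, -1 if the heap fallback could not allocate.
 * *used_heap reports which path ran (illustration only). */
static int gather_send(const src_vec *vec, int nvec,
                       size_t *nbytes, int *used_heap)
{
    dst_buf on_stack[MAX_STACK_BUFFERS];
    dst_buf *bufs = on_stack;      /* common case: no allocation at all */
    size_t sent = 0;
    int i, rv;

    assert(nvec >= 0);
    *used_heap = 0;

    if (nvec > MAX_STACK_BUFFERS) {
        /* Rare case: too many buffers for the stack array. */
        bufs = malloc(sizeof(*bufs) * (size_t)nvec);
        if (bufs == NULL)
            return -1;
        *used_heap = 1;
    }

    for (i = 0; i < nvec; i++) {
        bufs[i].buf = vec[i].base;
        bufs[i].len = (unsigned long)vec[i].len;
    }

    rv = fake_send(bufs, nvec, &sent);

    if (bufs != on_stack)          /* the one extra check this variant costs */
        free(bufs);
    if (rv != 0)
        return rv;

    *nbytes = sent;
    return 0;
}
```

Whether the extra comparison matters next to the cost of the send itself is something to measure rather than assume; the two-path layout above avoids it entirely.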



