httpd-docs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Httpd Wiki] Update of "PerformanceScalingUp" by jmcg
Date Tue, 16 Nov 2010 21:08:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Httpd Wiki" for change notification.

The "PerformanceScalingUp" page has been changed by jmcg.
The comment on this change is: Adding TOC, removing numbering, cleaning Solaris un-facts.
Thanks to Sander..
http://wiki.apache.org/httpd/PerformanceScalingUp?action=diff&rev1=3&rev2=4

--------------------------------------------------

  
  See also: http://wiki.apache.org/httpd/PerformanceScalingOut
  
+ <<TableOfContents(2)>>
+ 
+ = Acknowledgment =
+ This documentation is based on the ApacheCon Presentations on
+ Performance and Scalability, by Sander S. Temme. The original
+ PDFs can be found at: http://people.apache.org/~sctemme/ApconUS2007/
+ 
- = 1     Introduction =
+ = Introduction =
  The Performance Tuning page in the Apache 1.3 documentation says:
  
         “Apache is a general webserver, which is designed to be correct first,
@@ -33, +40 @@

  In this paper and the ApacheCon session it accompanies, several
  aspects of web server performance will be discussed.
  
- == 1.1    What Will and Will Not Be Discussed ==
+ == What Will and Will Not Be Discussed ==
  The session will focus on easily accessible configuration and tuning options for
  Apache 1.3 and 2 as well as monitoring tools. Monitoring tools will allow you
  to observe your web server to gather information about its performance, or
@@ -43, +50 @@

  kernel. We do assume, though, that you have some familiarity with the Apache
  configuration file.
  
- = 2      Monitoring Your Server =
+ = Monitoring Your Server =
  The first task when sizing or performance-tuning your server is to find out how
  your system is currently performing. By monitoring your server under real-world
  load, or artificially generated load, you can extrapolate its behavior under stress,
  such as when your site is mentioned on Slashdot.
  
- == 2.1      Monitoring Tools ==
+ == Monitoring Tools ==
- === 2.1.1     top ===
+ === top ===
  The top tool ships with Linux and FreeBSD, and can be downloaded for Solaris.
  It collects a number of statistics for the system and for each running
  process, then displays them interactively on your terminal. The data displayed
@@ -68, +75 @@

  How to do this is described in Section 3.1.3. Top is, however, an interactive tool
  and running it continuously has few if any advantages.
  
- === 2.1.2     free ===
+ === free ===
  This command is only available on Linux. It shows how much memory and swap
  space is in use. Linux allocates unused memory as file system cache. The free
  command shows usage both with and without this cache. The free command
@@ -84, +91 @@

  Swap:       3903784      12540  3891244
  }}}
  
- === 2.1.3    vmstat ===
+ === vmstat ===
  This command is available on many unix platforms. It displays a large number
  of operating system metrics. Run without argument, it displays a status line for
  that moment. When a numeric argument is added, the status is redisplayed at
@@ -122, +129 @@

  generate three reports and then exit.
  
  
- === 2.1.4    SE Toolkit ===
+ === SE Toolkit ===
  The SE Toolkit is a system monitoring toolkit for Solaris. Its programming
  language is based on the C preprocessor and comes with a number of sample
  scripts. It can use both the command line and the GUI to display information.
@@ -140, +147 @@

  to bring to market a multiplatform monitoring tool built on the same principles
  as SE Toolkit, written in Java.
  
- === 2.1.5    mod_status ===
+ === mod_status ===
  The mod_status module gives an overview of the server performance at a given
  moment. It generates an HTML page with, among others, the number of Apache
  processes running and how many bytes each has served, and the CPU load
@@ -149, +156 @@

  in your httpd.conf, the mod_status page will give you more information at
  the cost of a little extra work per request.
  
- == 2.2    Web Server Log Files ==
+ == Web Server Log Files ==
  Monitoring and analyzing the log files Apache writes is one of the most effective
  ways to keep track of your server health and performance. Monitoring the error
  log allows you to detect error conditions, discover attacks and find performance
@@ -159, +166 @@

  which allows you to predict when your performance needs will overtake your
  server capacity.
  
- === 2.2.1    Error Log ===
+ === Error Log ===
  The error log will contain messages if the server has reached the maximum
  number of active processes or the maximum number of concurrently open files.
  The error log also reflects when processes are being spawned at a higher-than-usual
@@ -187, +194 @@

  || debug || Debug-level messages ||
  
  
- The Error Log is configured through the ErrorLog and LogLevel configuration
+ The Error Log is configured through the `ErrorLog` and `LogLevel` configuration
  directives. The error log of Apache’s main server configuration receives
  the log messages that pertain to the entire server: startup, shutdown, crashes,
- excessive process spawns, etc. The ErrorLog directive can also be used in virtual
+ excessive process spawns, etc. The `ErrorLog` directive can also be used in virtual
  host containers. The error log of a virtual host receives only log messages
  specific to that virtual host, such as authentication failures and ‘File not Found’
  errors.
@@ -203, +210 @@

  You could block these attempts using a firewall or mod_security, but this falls
  outside the scope of this discussion.
  
- The LogLevel directive determines the level of detail included in the logs.
+ The `LogLevel` directive determines the level of detail included in the logs.
  There are eight log levels, described in Table 1. The default log level is warn. A
  production server should not be run on debug, but increasing the level of detail
  in the error log can be useful during troubleshooting.
  
- === 2.2.2    Access Log ===
+ === Access Log ===
  Apache keeps track of every request it services in its access log file. In addition
  to the time and nature of a request, Apache can log the client IP address, date
  and time of the request, the result and a host of other information. The various
@@ -241, +248 @@

  || Status Code || 200  || Response code ||
  || Content Bytes || 9747 ||  Bytes transferred w/o headers ||
  
- === 2.2.3    Rotating Log Files ===
+ === Rotating Log Files ===
  There are several reasons to rotate logfiles. Firstly, they simply become too
  large to handle over time. Some operating systems have a hard file size limit of
  two Gigabytes. Secondly, any periodic log file analysis should not be performed
@@ -289, +296 @@

  timestamp suffix to its name. This method for rotating logfiles works well on
  unix platforms, but is currently broken on Windows.
  
- === 2.2.4    Logging and Performance ===
+ === Logging and Performance ===
  Writing entries to the Apache log files obviously takes some effort, but the information
  gathered from the logs is so valuable that under normal circumstances
  logging should not be turned off. For optimal performance, you should put
@@ -317, +324 @@

  lines in memory before writing them to disk. This might yield better performance,
  but could affect the order in which the server’s log is written.
  
- == 2.3      Generating A Test Load ==
+ == Generating A Test Load ==
  It is useful to generate a test load to monitor system performance under realistic
  operating circumstances. Besides commercial packages such as LoadRunner,
  there are a number of freely available tools to generate a test load against your
@@ -343, +350 @@

  is in production, the test load may negatively affect the server’s response. Also,
  any data traffic you generate may be charged against your monthly traffic allowance.
  
- = 3     Configuring for Performance =
+ = Configuring for Performance =
- == 3.1     Apache Configuration ==
+ == Apache Configuration ==
  The Apache 1.3 httpd is a pre-forking web server. When the server starts, the
  parent process spawns a number of child processes that do the actual work of
  servicing requests. Apache 2 introduced the concept of the Multi-Processing
@@ -364, +371 @@

  limit beyond which clients will be denied access. However, once requests start
  backing up, system performance is likely to degrade.
  
- === 3.1.1    MaxClients ===
+ === MaxClients ===
  The MaxClients directive in your Apache httpd configuration file specifies the
  maximum number of workers your server can create. It has two related directives,
  `MinSpareServers` and `MaxSpareServers`, which specify the number of
  workers Apache keeps waiting in the wings ready to serve requests. The absolute
  maximum number of processes is hard coded into Apache 1.3 as the parameter
- HARD SERVER LIMIT: in order to change it you’d have to recompile the
+ HARD_SERVER_LIMIT: in order to change it you’d have to recompile the
  server. Fortunately, most distributors have raised this limit well beyond the
  default of 256. In Apache 2.0, this limit is configurable through the `ServerLimit`
  directive.
  
- === 3.1.2    Spinning Threads ===
+ === Spinning Threads ===
  For Apache 1.3, or the prefork MPM of Apache 2.0, the above directives are
  all there is to determining the process limit. However, if you are running a
  threaded MPM the situation is a little more complicated. Threaded MPMs
- support the ThreadsPerChild directive14 . Apache requires that MaxClients is
+ support the `ThreadsPerChild` directive. Apache requires that `MaxClients` is
- evenly divisible by ThreadsPerChild. If you set either directive to a number
+ evenly divisible by `ThreadsPerChild`. If you set either directive to a number
  that doesn’t meet this requirement, Apache will send a message of complaint
- to the error log and adjust the ThreadsPerChild value downwards until it is an
+ to the error log and adjust the `ThreadsPerChild` value downwards until it is an
- even factor of MaxClients.
+ even factor of `MaxClients`.
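
The downward adjustment described above can be illustrated with a few lines of Python. This is only a sketch of the rule as the text states it, not httpd's actual code; the function name `adjust_threads_per_child` is made up for illustration.

```python
def adjust_threads_per_child(max_clients: int, threads_per_child: int) -> int:
    """Lower threads_per_child until it divides max_clients evenly.

    Hypothetical helper: it mirrors the behaviour described in the text,
    it is not taken from the httpd source.
    """
    if max_clients < 1 or threads_per_child < 1:
        raise ValueError("both values must be positive integers")
    # Never start above max_clients; 1 always divides, so the loop ends.
    candidate = min(threads_per_child, max_clients)
    while max_clients % candidate != 0:
        candidate -= 1
    return candidate


if __name__ == "__main__":
    # 25 already divides 150, so it is left alone.
    print(adjust_threads_per_child(150, 25))
    # 40 does not divide 150; the value is walked down to the next factor.
    print(adjust_threads_per_child(150, 40))
```

Checking the pair this way before a restart avoids the complaint in the error log.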
  
- === 3.1.3    Sizing MaxClients ===
+ === Sizing MaxClients ===
  Optimally, the maximum number of processes should be set so that all the
  memory on your system is used, but no more. If your system gets so overloaded
  that it needs to heavily swap core memory out to disk, performance will degrade
- quickly. The formula for determining MaxClients is fairly simple:
+ quickly. The formula for determining `MaxClients` is fairly simple:
  
  {{{
                total RAM − RAM for OS − RAM for external programs
@@ -416, +423 @@

  and scripts that run outside the web server process. However, if you
  have a Java virtual machine running Tomcat on the same box it will need a
  significant amount of memory as well. The above assessment should give you
- an idea how far you can push MaxClients, but it is not an exact science. When
+ an idea how far you can push `MaxClients`, but it is not an exact science. When
- in doubt, be conservative and use a low MaxClients value. The Linux kernel
+ in doubt, be conservative and use a low `MaxClients` value. The Linux kernel
  will put extra memory to good use for caching disk access. On Solaris you need
  enough available real RAM memory to create any process. If no real memory is
  available, Apache will start writing ‘No space left on device’ messages to the error
- log and be unable to fork additional child processes, so a higher MaxClients
+ log and be unable to fork additional child processes, so a higher `MaxClients`
  value may actually be a disadvantage.
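
As a rough worked example of the sizing rule, the sketch below divides the memory left over for Apache by the size of one httpd process. The divisor (average RAM per httpd process) and all figures are assumptions chosen for illustration; measure the real values on your own system, for instance with top.

```python
def estimate_max_clients(total_ram_mb: float,
                         os_ram_mb: float,
                         external_ram_mb: float,
                         per_process_mb: float) -> int:
    """Estimate a MaxClients value from memory figures given in megabytes.

    per_process_mb is assumed to be the average resident size of one
    httpd process; this helper is illustrative, not part of Apache.
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available = total_ram_mb - os_ram_mb - external_ram_mb
    if available <= 0:
        return 0  # nothing left for Apache: the box is already over-committed
    # Round down: being conservative is safer than swapping.
    return int(available // per_process_mb)


if __name__ == "__main__":
    # Hypothetical box: 4 GB RAM, 512 MB for the OS, 512 MB for other
    # programs, 20 MB per httpd process.
    print(estimate_max_clients(4096, 512, 512, 20))
```

The result is an upper bound, not a target; as noted above, a lower value is the safer choice when in doubt.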
  
- === 3.1.4    Selecting your MPM ===
+ === Selecting your MPM ===
  The prime reason for selecting a threaded MPM is that threads consume fewer
  system resources than processes, and it takes less effort for the system to switch
  between threads. This is more true for some operating systems than for others.
@@ -449, +456 @@

  run PHP in the preforked MPM without fear of losing too much performance
  relative to the threaded option.
  
- === 3.1.5   Spinning Locks ===
+ === Spinning Locks ===
  Apache maintains an inter-process lock around its network listener. For all
  practical purposes, this means that only one httpd child process can receive
  a request at any given time. The other processes are either servicing requests
@@ -458, +465 @@

  only one process allowed in the door at any time. On a heavily loaded web
  server with requests arriving constantly, the door spins quickly and requests are
  accepted at a steady rate. On a lightly loaded web server, the process that
- currently “holds” the lock may have to stay in the door for a while, durin
+ currently “holds” the lock may have to stay in the door for a while, during
  which all the other processes sit idle, waiting to acquire the lock. At this
  time, the parent process may decide to terminate some children based on its
- MaxSpareServers directive.
+ `MaxSpareServers` directive.
  
  === 3.1.6   The Thundering Herd ===
  The function of the ‘accept mutex’ (as this inter-process lock is called) is to keep
@@ -485, +492 @@

  have a virtual host serving SSL requests), it will activate the accept mutex to
  avoid internal conflicts.
  
- You can manipulate the accept mutex with the AcceptMutex directive. Be-
+ You can manipulate the accept mutex with the `AcceptMutex` directive. Besides
  turning the accept mutex off, you can select the locking mechanism. Common
  locking mechanisms include fcntl, System V Semaphores and pthread locking.
  Not all are available on every platform, and their availability also depends
@@ -496, +503 @@

  matically recognizes the single listener situation described above and knows if
  it is safe to run without mutex on your platform.
  
- 3.2     Tuning the Operating System
+ == Tuning the Operating System ==
  People often look for the ‘magic tune-up’ that will make their system perform
  four times as fast by tweaking just one little setting. The truth is, present-day
  UNIX derivatives are pretty well adjusted straight out of the box and there is
  not a lot that needs to be done to make them perform optimally. However,
  there are a few things that an administrator can do to improve performance.
- 3.2.1    RAM and Swap Space
+ 
+ === RAM and Swap Space ===
  The usual mantra regarding RAM is “more is better”. As discussed above, unused
  RAM is put to good use as file system cache. The Apache processes get
  bigger if you load more modules, especially if you use modules that generate
@@ -524, +532 @@

  enough disk-based swap space available and the machine gets overloaded, it may
  get very, very slow as the system needs to swap memory pages to disk and back,
  but when the load decreases the system should recover. Remember, you still
- have MaxClients to keep things in hand.
+ have `MaxClients` to keep things in hand.
  
  Most unix-like operating systems use designated disk partitions for swap
  space. When a system starts up it finds all swap partitions on the disk(s), by
@@ -540, +548 @@

  on how to do this, see the manual pages for the mkswap and swapon or swap
  programs.
  
- === 3.2.2    ulimit: Files and Processes ===
+ === ulimit: Files and Processes ===
  Given a machine with plenty of RAM and processor capacity, you can run
  hundreds of Apache processes if necessary. . . and if your kernel allows it. The
  Linux 2.2 kernel series by default limited the number of processes a user can
@@ -572, +580 @@

  }}}
  command. Once again, this must be done prior to starting Apache.
  
- === 3.2.3   Setting User Limits on Linux System Startup ===
+ === Setting User Limits on System Startup ===
  Under Linux, you can set the ulimit parameters on bootup by editing the
  `/etc/security/limits.conf` file. This file allows you to set soft and hard limits
  on a per-user or per-group basis; the file contains commentary explaining the
@@ -584, +592 @@

  All items can have a ‘soft’ and a ‘hard’ limit: the first is the default setting
  and the second the maximum value for that item.
  
- Solaris does not seem to have a similar mechanism for manipulating limit
- values at boot time: you will have to set them in your startup script(s).
+ Solaris has a similar mechanism for manipulating limit values at boot time:
+ in `/etc/system` you can set kernel tunables that are valid for the entire system.
+ These are the same tunables that can be set with the `mdb` kernel debugger
+ at run time.
+ The hard and soft limits on open file descriptors, corresponding to `ulimit -n`, can be set via:
+ {{{
+ set rlim_fd_max=65536
+ set rlim_fd_cur=2048
+ }}}
+ Solaris calculates the maximum number of allowed processes per user (`maxuprc`)
+ based on the total amount of available memory on the system (`maxusers`). You can
+ review the numbers with
+ {{{
+ sysdef -i | grep maximum
+ }}}
+ but it is not recommended to change them.
  
+ 
- === 3.2.4    Turn Off Unused Services and Modules ===
+ === Turn Off Unused Services and Modules ===
  Many UNIX and Linux distributions come with a slew of services turned on by
  default. You probably need few of them. For example, your web server does
  not need to be running sendmail, nor is it likely to be an NFS server, etc. Turn
  them off.
  
  On Red Hat Linux, the chkconfig tool will help you do this from the command
+ line. On Solaris systems, `svcs` shows which services are enabled and
+ `svcadm` disables them.
- line. On Solaris systems, my approach is to inspect the /etc/rc[123].d directories
- and to change the first character of the name of startup scripts I don’t want to
- start automatically from S to s. Since the Solaris file system is case sensitive, this
- disables services without actually altering them so they become unrecognizable.
- 
- Thus, S88sendmail becomes s88sendmail. This way, the init process will
- pass them over but it’s still evident to other sysadmins that they were once
- active. While Solaris transitions through subsequent run levels on startup, the
- Linux initialization just executes all the scripts in the default run level directory.
- The default run level for a Linux web server should be 3: you don’t need to run
- an X-Windows desktop on a web server so level 5 should not be necessary.
  
  In a similar fashion, cast a critical eye on the Apache modules you load. Most
  binary distributions of Apache, and pre-installed versions that come with Linux
- distributions, have their modules enabled through the LoadModule directive.
+ distributions, have their modules enabled through the `LoadModule` directive.
  
  A notable exception is the Apache httpd on Cobalt Raq servers, which has
- mod perl compiled statically to run the GUI–despite the fact that the GUI
+ mod_perl compiled statically to run the GUI–despite the fact that the GUI
  Apache is running as an entirely different process from the one doing the actual
- serving. You cannot disable this instance of mod perl. Other modules,
+ serving. You cannot disable this instance of mod_perl. Other modules,
  however, may be culled: if you don’t use their functionality and configuration
- directives, you can turn them off by commenting out the corresponding LoadModule
+ directives, you can turn them off by commenting out the corresponding `LoadModule`
  lines. Read the documentation on each module’s functionality before
  deciding whether to keep it enabled. While the performance overhead of an
  unused module is small, it's also unnecessary.
  
- = 4     Caching Content =
+ = Caching Content =
  Requests for dynamically generated content usually take significantly more resources
  than requests for static content. Static content consists of simple files–pages,
  images, etc.–on disk that are very efficiently served. On platforms that
@@ -637, +651 @@

  by turning popular dynamic requests into static requests. In this section, two
  approaches to this will be discussed.
  
- == 4.1     Making Popular Pages Static ==
+ == Making Popular Pages Static ==
  By pre-rendering the response pages for the most popular queries in your application,
  you can gain a significant performance improvement without giving
  up the flexibility of dynamically generated content. For instance, if your application
  is a flower delivery service, you would probably want to pre-render
- your catalog pages for red roses during the weeks leading up to Valentine’s Day.
+ your catalog pages for red roses during the weeks leading up to Valentine's Day.
  When the user searches for red roses, they are served the pre-rendered page.
  Queries for, say, yellow roses will be generated directly from the database. The
  mod rewrite module included with Apache is a great tool to implement these
  substitutions.
  
- === 4.1.1    Example: A Statically Rendered Blog ===
+ === Example: A Statically Rendered Blog ===
  Blosxom is a lightweight web log package that runs as a CGI. It is written in
  Perl and uses plain text files for entry input. Besides running as CGI, Blosxom
  can be run from the command line to pre-render blog pages. Pre-rendering
@@ -656, +670 @@

  that large numbers of people actually start reading your blog.
  
  To run blosxom for static page generation, edit the CGI script according to
- the documentation. Set the $static dir variable to the DocumentRoot of the
+ the documentation. Set the `$static_dir` variable to the `DocumentRoot` of the
  web server, and run the script from the command line as follows:
  {{{
- $ perl blosxom.cgi -password=’whateveryourpassword’
+ $ perl blosxom.cgi -password='whateveryourpassword'
  }}}
  This can be run periodically from Cron, after you upload content, etc. To
  make Apache substitute the statically rendered pages for the dynamic content,
  we’ll use mod_rewrite. This module is included with the Apache source code,
  but is not compiled by default. It can be built with the server by passing the
- option --enable-rewrite[=shared] to the configure command. Many binary
+ option `--enable-rewrite[=shared]` to the configure command. Many binary
  distributions of Apache come with mod_rewrite included. The following is an
  example of an Apache virtual host that takes advantage of pre-rendered blog
  pages:
@@ -711, +725 @@

  </VirtualHost>
  }}}
  
- == 4.2     Caching Content With mod cache ==
+ == Caching Content With mod_cache ==
+ The mod_cache module provides
- As described in [8], mod cache is no longer considered experimental in httpd 2.2
- and is now included in the base distribution. The mod cache module provides
  intelligent caching of HTTP responses: it is aware of the expiration timing and
  content requirements that are part of the HTTP specification. The mod_cache
  module caches URL response content. If content sent to the client is considered
@@ -721, +734 @@

  directly from the cache. The provider module for mod_cache, mod_mem_cache or
  mod_disk_cache, determines whether the cached content is stored on disk or in
  memory. Most server systems will have more disk available than memory, and
- it’s good to note that some operating system kernels cache frequently accessed
+ it's good to note that some operating system kernels cache frequently accessed
  disk content transparently in memory.
  
  To enable efficient content caching and avoid presenting the user with stale
@@ -736, +749 @@

  of Apache, or it came with your port or package collection, it may have
  mod_cache already included.
  
- === 4.2.1    Example: wiki.apache.org ===
+ === Example: wiki.apache.org ===
  The Apache Software Foundation Wiki is served by MoinMoin. MoinMoin
  is written in Python and runs as a CGI. To date, any attempts to run it under
  mod_python have been unsuccessful. The CGI proved to place an untenably
@@ -745, +758 @@

  Apache Infrastructure team turned to mod_cache. It turned out MoinMoin
  needed a small patch to ensure proper behaviour behind the caching server:
  certain requests can never be cached and the corresponding Python modules
- were patched to send the proper HTTP response headers. After this modifica-
+ were patched to send the proper HTTP response headers. After this modification,
- tion, the cache in front of the Wiki was enabled with the following configuration
+ the cache in front of the Wiki was enabled with the following configuration
- snippet in httpd.conf:
+ snippet in `httpd.conf`:
  
  {{{
  CacheRoot /raid1/cacheroot

---------------------------------------------------------------------
To unsubscribe, e-mail: docs-unsubscribe@httpd.apache.org
For additional commands, e-mail: docs-help@httpd.apache.org

