httpd-users mailing list archives

From Jeff Trawick <traw...@gmail.com>
Subject Re: [users@httpd] tuning question
Date Sat, 12 Jul 2014 21:34:57 GMT
On Sat, Jul 12, 2014 at 5:06 PM, Miles Fidelman <mfidelman@meetinghouse.net>
wrote:

> Jeff Trawick wrote:
>
>  On Sat, Jul 12, 2014 at 1:25 PM, Miles Fidelman <mfidelman@meetinghouse.net> wrote:
>>
>>     Hi Folks,
>>
>>     Every once in a while, a crawler comes along and starts indexing
>>     our site - and in the process pushes our server's load average
>>     through the roof.
>>
>>     Short of blocking the crawlers, can anybody suggest some quick
>>     tuning adjustments to make, to reduce load (setting the max.
>>     number of servers and/or requests, renicing processes)?
>>
>>
>> Use robots.txt to block access to dynamically generated resources which are
>> expensive to generate and not necessary for search hits?
>>
>> Is it using a lot of concurrent requests, or is the main load issue due to
>> the cost of the requests it is making?
>>
> A bit of both.



If you want to limit concurrent requests just from web crawlers, try
something like mod_qos.  (See
http://unix.stackexchange.com/questions/37481/throttling-web-crawlers)

If it were me, I'd try to block needless, expensive requests with
robots.txt too.  http://www.robotstxt.org/robotstxt.html
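
As a rough sketch of the robots.txt idea, the snippet below embeds a
sample robots.txt and checks it with Python's standard
urllib.robotparser before you deploy it. The paths (/search, /cgi-bin/)
and the "ExampleBot" user agent are placeholders I made up for
illustration; substitute whichever URLs are expensive on your site.
Keep in mind that robots.txt is advisory, so it only helps with
crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The disallowed paths are assumptions,
# standing in for whatever dynamic resources are costly to generate.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cgi-bin/
"""


def allowed(path, agent="ExampleBot"):
    """Return True if a well-behaved crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/search?q=apache", "/cgi-bin/report.cgi"):
        print(path, "allowed" if allowed(path) else "blocked")
```

Static pages stay crawlable, so search hits for ordinary content are
unaffected; only the prefixes you list are off limits to compliant
crawlers.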



>
>
>
> --
> In theory, there is no difference between theory and practice.
> In practice, there is.   .... Yogi Berra
>
>


-- 
Born in Roswell... married an alien...
http://emptyhammock.com/
