httpd-users mailing list archives

From dale's stuff
Subject Re: Wget
Date Mon, 26 Aug 2002 12:01:25 GMT

On Monday, August 26, 2002, at 12:09  PM, Rodent of Unusual Size wrote:

> Boyle Owen wrote:
>> What do you have against wget? If you put pages on the web, they are
>> publicly available, so what do you care what agent people use to browse
>> them?

Ahh... because too many people use such a tool for purposes of the dark
side, e.g. copying my content and images wholesale.

> Wget is hardly a browser.  I have it blocked because I kept getting
> hammered by recursive site scraping, and found some of my pages
> reproduced elsewhere as a consequence.

I have encountered the same thing, with people doing what appears to be a 
DoS attack on my server.

>> If it's just that you don't like robots because they don't read your
>> adverts,
>> then create a robots.txt file in your docroot and "Disallow: /" (see
> Does wget honour robots.txt?

By default, yes. However, there is a command that makes wget ignore the
robots.txt file (robots = off in .wgetrc, or -e robots=off on the command
line).

Also, you can have wget masquerade as a different browser (its --user-agent
option), so blocking on the User-Agent string alone is easy to get around.
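
To make the robots.txt point concrete, here is a rough sketch of the check a well-behaved robot performs before fetching a page, using Python's standard urllib.robotparser. The agent string "Wget/1.8" and the example.com URL are placeholders I made up for illustration, and this is not how wget itself is implemented. The point is only that the check happens on the client side, so a client told to skip it never asks.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule suggested earlier in the thread: keep all robots out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant robot with this agent name may fetch url."""
    parser = RobotFileParser()
    # Feed the rules in directly instead of downloading /robots.txt.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical agent name and URL, for illustration only.
    print(allowed("Wget/1.8", "http://example.com/index.html"))  # prints False
```

With an empty robots.txt the same call returns True. Either way the server only publishes the rules; nothing in robots.txt stops a client that does not consult it.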

> --
> #ken	P-)}
> Ken Coar, Sanagendamgagwedweinini  http://Golux.Com/coar/
> Author, developer, opinionist      http://Apache-Server.Com/
> "Millennium hand and shrimp!"

