httpd-users mailing list archives

From "Chris Knight" <boh...@gmail.com>
Subject Re: Dummy vhost for intruders - comments please
Date Wed, 17 Dec 2008 18:35:44 GMT
On Tue, Dec 16, 2008 at 3:13 PM, Peter Horn <peter.horn@bigpond.com> wrote:
> I don't think this is quite off-topic, just a bit left of centre. :-\
> I run a small site with two subdomains of no-ip.org (like dyndns) using
> NameVirtualHost. Looking at the access log, a few percent of my traffic was
> from bots like Morfeus F***ing Scanner [my censorship], intrusion attempts
> (e.g. GET /login_page.php) and just plain old "wrong numbers". Nothing from
> what I'd think of as "good" bots (Google, etc.) Initially, I added a first
> (i.e. default) vhost to serve a page saying "If you don't know the URL, I'm
> not telling you." Then I refined this with the obvious "Deny from all".

I suppose this is something you can do now.  When I first started
using name-based virtual hosting, my first vhost was a simple page
telling readers that they had landed there because their browser did
not support HTTP/1.1 requests (and so sent no Host header), with
links to the latest browsers.  I only got bitten by this once, when a
friend on a Hughes satellite connection, which used an HTTP/1.0 proxy
to improve perceived speed, couldn't get to her sites and got really,
really mad at me.
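
To illustrate why those requests ended up on the first vhost, here is a
rough Python sketch of how name-based selection behaves: the Host header
is compared against each vhost's names, and anything that does not match
(or has no Host header at all) falls through to the first vhost listed.
This is a simplified model, not Apache's actual code; the hostnames and
the select_vhost function are made up for the example, and it only
handles plain hostnames, not IPv6 literals.

```python
# Hypothetical vhosts in configuration order: (ServerName, ServerAliases).
# The first entry plays the role of the default (catch-all) vhost.
VHOSTS = [
    ("catchall.example", []),
    ("site1.example.org", ["www.site1.example.org"]),
    ("site2.example.org", []),
]


def select_vhost(host_header, vhosts=VHOSTS):
    """Return the ServerName of the vhost that would serve the request."""
    if host_header:
        # Drop an optional ":port" suffix and compare case-insensitively.
        name = host_header.split(":")[0].strip().lower()
        for server_name, aliases in vhosts:
            if name == server_name or name in aliases:
                return server_name
    # No Host header (typical of old HTTP/1.0 clients) or no match:
    # the first vhost in the list handles the request.
    return vhosts[0][0]
```

A scanner that connects by bare IP address, or a client that sends no
Host header, gets the first entry; a request for a configured name gets
that name's vhost.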

> While this is definitely effective, do you consider it
> honourable/ethical/sneaky/clever/dumb/whatever? Are there any likely
> side-effects?

My opinion is that it is your server and you can do what you want with
it.  I have always been bothered by the 'robot exclusion protocol'
because its premise is that any commercial business can scan and copy
your content by default, unless you find them and exclude them.
archive.org is a personal pet peeve of mine, though I am sure I am in
the minority there.

With the goal of catching the bad bots, here is another idea.  Create
a subdirectory on your site that contains a single index.php (or
whatever your preferred server-side scripting language is) and have
that file append a "Deny from [REMOTE_ADDR of the request]" line to the
site's .htaccess file.  Then list that directory as a Disallow entry in
your robots.txt file.  Only the really evil bots deliberately crawl the
paths excluded in robots.txt, and once they do you'll be blocking their
requests.
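
A minimal sketch of such a trap script, written here as a Python CGI
script rather than PHP. The .htaccess path and the function names are
assumptions made up for the example, and the web server user would need
write access to that file. It validates REMOTE_ADDR before writing it, so
that nothing other than an IP address can end up in the file, and it
skips addresses that are already listed. There is no file locking, so
treat it as an illustration rather than something hardened.

```python
#!/usr/bin/env python3
import ipaddress
import os
import sys

# Hypothetical location of the site's .htaccess file; adjust as needed.
HTACCESS_PATH = "/var/www/example/.htaccess"


def deny_line(remote_addr):
    """Build the Deny line; raises ValueError if it is not an IP address."""
    addr = ipaddress.ip_address(remote_addr)
    return f"Deny from {addr}\n"


def block_address(htaccess_path, remote_addr):
    """Append a Deny line for remote_addr. Returns False if already listed."""
    line = deny_line(remote_addr)
    try:
        with open(htaccess_path, "r", encoding="utf-8") as fh:
            existing = fh.read()
    except FileNotFoundError:
        existing = ""
    if line.strip() in (old.strip() for old in existing.splitlines()):
        return False
    with open(htaccess_path, "a", encoding="utf-8") as fh:
        # Make sure the new directive starts on its own line.
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        fh.write(line)
    return True


def main():
    # CGI passes the client address in the REMOTE_ADDR environment variable.
    remote_addr = os.environ.get("REMOTE_ADDR", "")
    try:
        block_address(HTACCESS_PATH, remote_addr)
    except (ValueError, OSError):
        pass  # Bad address or unwritable file: still refuse the request.
    sys.stdout.write("Status: 403 Forbidden\r\n")
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    sys.stdout.write("Forbidden\n")


if __name__ == "__main__":
    main()
```

The matching robots.txt entry would be a Disallow line for the trap
directory under "User-agent: *".  Note that "Deny from" is the access
control syntax of the Apache versions current at the time of writing;
later releases use a different directive, so check the documentation for
your version.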

-Chris
