httpd-dev mailing list archives

From Greg Stein <gst...@lyra.org>
Subject Re: Ban MSIECrawler
Date Sun, 26 Dec 1999 09:53:42 GMT
On Sun, 26 Dec 1999, Marc Slemko wrote:
> Just a FYI; if you haven't already (and most people probably have), make
> sure to ban anything resembling:
> 
> Mozilla/4.0 (compatible; MSIE 4.01; MSIECrawler; Windows 95)
> 
> from your site.  This is, AFAIK, IE in its lame-ass "kill the web" mode.
> 
> I just saw a site get pummelled by over 250 hits per second from a bunch
> of users using this.  They weren't doing anything special, I have no
> reason to think there was any planned DoS attack.  The site simply had
> something like:
> 
> ErrorDocument 404 http://site.example.com/notthere.html
> 
> in a .htaccess, ...where /notthere.html didn't exist on the site.
> 
> So whenever this pile of junk got a 404, it ended up getting stuck in a
> loop of redirects to the same page it was already on.  How can MS release
> software like this?

Actually, it is a pretty nice feature to yank down a bunch of pages so
that they will be available when you disconnect your laptop from the
Internet. I'm not familiar with a similar feature in any other browser.
It's kind of nice, actually, to have a copy of a web site on the plane
with you for reference while you work.

If a site continues to return a redirect on an error, then what is the
client supposed to do? It should go get the other document.

Personally... I can easily understand the fetch loop that results. Since
an error occurred, the crawler doesn't have the document in its cache, so
it figures it still has to go get it. Easy mistake to make, I'd say --
keeping track of what is in the cache as opposed to what has already been
requested.
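
To make that distinction concrete, here is a minimal sketch of the two
bookkeeping strategies. It is not how MSIECrawler is implemented (that is
not known here); `crawl`, `misconfigured_site`, and `track_requested` are
hypothetical names, and the "server" is a stub that redirects every
request to the same missing page, as in Marc's example.

```python
from typing import Callable, Optional, Set, Tuple

# A response is (status code, redirect target or None).
Response = Tuple[int, Optional[str]]


def crawl(start: str,
          fetch: Callable[[str], Response],
          max_requests: int = 1000,
          track_requested: bool = True) -> int:
    """Follow redirects from `start` and return how many requests were sent."""
    cached: Set[str] = set()      # URLs whose content was actually stored
    requested: Set[str] = set()   # URLs we have asked for, success or not
    url: Optional[str] = start
    sent = 0
    while url is not None and sent < max_requests:
        # The suspected mistake: consulting only the cache to decide
        # whether a URL still needs fetching.
        seen = requested if track_requested else cached
        if url in seen:
            break
        requested.add(url)
        status, location = fetch(url)
        sent += 1
        if status == 200:
            cached.add(url)
            url = None
        elif status in (301, 302):
            url = location        # follow the redirect
        else:
            url = None            # plain error: give up on this URL
    return sent


def misconfigured_site(url: str) -> Response:
    """Stub server: every page is missing and redirects to a missing page."""
    return (302, "http://site.example.com/notthere.html")


start = "http://site.example.com/missing.html"
cache_only = crawl(start, misconfigured_site, max_requests=50,
                   track_requested=False)
with_history = crawl(start, misconfigured_site, max_requests=50,
                     track_requested=True)
```

With cache-only bookkeeping the loop stops only at the `max_requests`
cap, because nothing ever lands in the cache. Remembering requested URLs
ends it after the second request.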

And it takes a real serious brain on a tester to think, "hey... how about
if I misconfigure the server in *this* way to see what IE does." There are
probably three testers on the planet that might have thought of that.

IMO, don't ban MSIECrawler (that'll just piss off users trying to cache
your site). Fix the ErrorDocument response.
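
One way to do that, as a sketch: point ErrorDocument at a local URL-path
that really exists instead of a full URL. With a full URL, Apache sends
the client a redirect, so the client never sees the original 404 status;
with a local path, the document is served with the 404 status intact. The
path `/errors/404.html` is an assumption for illustration, not something
from the site in question.

```apache
# In the .htaccess: a local path (hypothetical) to a page that exists
# on this site. No redirect is issued and the 404 status is preserved.
ErrorDocument 404 /errors/404.html
```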

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

