www-apache-bugdb mailing list archives

From Marc Slemko <ma...@znep.com>
Subject Re: mod_dir/1057: Web robots should be told not to index auto-generated index pages
Date Thu, 28 Aug 1997 16:10:04 GMT
The following reply was made to PR mod_dir/1057; it has been noted by GNATS.

From: Marc Slemko <marcs@znep.com>
To: Olly Betts <olly@muscat.co.uk>
Subject: Re: mod_dir/1057: Web robots should be told not to index auto-generated index pages

Date: Thu, 28 Aug 1997 10:04:44 -0600 (MDT)

 
 On Thu, 28 Aug 1997, Olly Betts wrote:
 
 > In message <199708271747.KAA13158@hyperreal.org>, brian@hyperreal.org writes:
 > >Synopsis: Web robots should be told not to index auto-generated index pages
 > >
 > >State-Changed-From-To: open-closed
 > >State-Changed-By: brian
 > >State-Changed-When: Wed Aug 27 10:47:33 PDT 1997
 > >State-Changed-Why:
 > >We talked about it on the developers list, and don't necessarily
 > >agree that index pages shouldn't be indexed by robots.  If
 > >you want to add custom META tags to your pages, you can set
 > >"IndexOptions SuppressHTMLPreamble", and then put a full HTML <HEAD>
 > >section in HEADER.html in each directory.
 > >
 > >
 > 
 > However, this relies on a majority of web page authors being savvy enough to
 > know about the protocol, get their admin to add the IndexOptions line and to
 > remember to copy HEADER.html into every directory.  I think this is at best
 > optimistic.
 > 
 > Does anyone really disagree that marking auto-index pages as
 > "noindex,follow" *by default* is not a good idea?  This is what my
 > suggestion amounts to, since it could be overridden as you describe.
 
 Yes.  It is not a good idea.  Index pages can have a lot more than a
 directory index in them.  They can have headers, footers, file
 descriptions, none of which will necessarily appear anywhere else.
 
 This would probably be accepted as an IndexOptions setting if a patch were
 made, but it probably wouldn't be enabled by default.
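 
 In the meantime, the per-directory workaround described in the quoted
 state change might look roughly like the sketch below. It assumes
 "IndexOptions SuppressHTMLPreamble" is set for the directory, so that the
 server starts the listing with the contents of HEADER.html instead of its
 own preamble. The /example path, the title, and the heading are
 placeholders, not taken from any real configuration; the robots value is
 the "noindex,follow" suggested in the original report.

```html
<!-- HEADER.html (hypothetical example): must supply the whole HTML preamble,
     because SuppressHTMLPreamble stops the server from generating one. -->
<HTML>
<HEAD>
<TITLE>Index of /example</TITLE>
<!-- Ask robots not to index this listing, but still follow its links. -->
<META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">
</HEAD>
<BODY>
<H1>Index of /example</H1>
```

 The file has to be copied into every directory that should carry the tag,
 which is the maintenance burden raised in the original report.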
 
 > 
 > The real problem robots have with the current situation is that (assuming
 > the robot author even appreciates the problem) it is hard to come up with a
 > reliable way to determine if a page is an auto-generated index page.
 > 
 > Olly
 > 
 
