On Feb 26, 2012, at 12:52 PM, Nick Kew wrote:
We're already using the
<link rel="canonical" href="http://httpd.apache.org/docs/current/"/>
to tell Google to index the current docs instead of the old pages, although that's not (yet) on all of the 1.3 doc pages. Unfortunately, that's something of a manual process, because the 1.3 docs are static HTML rather than generated, and not every page in the 1.3 docs has an exact counterpart in the /current/ docs.
That's what robots.txt is for! Surely we can use that to stop indexing 2.0
as well as 1.3? Maybe even 2.2 once 2.4 is Windows-ready and in the distros?
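
For reference, a rough sketch of what that robots.txt suggestion might look like, checked with Python's standard urllib.robotparser. The /docs/1.3/ and /docs/2.0/ paths and the rules themselves are assumptions for illustration, not the site's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths below are assumed, not taken
# from the real httpd.apache.org configuration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /docs/1.3/
Disallow: /docs/2.0/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/docs/1.3/misc/rewriteguide.html", "/docs/current/mod/mod_rewrite.html"):
        print(path, is_crawlable("http://httpd.apache.org" + path))
```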
The rel canonical thing is a way to actively update the Google index for a particular page and search term, and has been very effective in updating certain searches. For example, searching Google for "rewriterule" has long given the 1.3 Rewrite Guide, but within 24 hours of adding a rel canonical tag, it started pointing to the 2.2 mod_rewrite docs as the top hit.
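
Since the 1.3 pages are plain HTML, adding the tag could in principle be scripted for the pages that do have a counterpart. Below is a minimal sketch, assuming each page has a closing </head> tag; the function name add_canonical is hypothetical, and choosing the right target URL for each page would still be a manual decision.

```python
import re
from html import escape


def add_canonical(page: str, canonical_url: str) -> str:
    """Insert a rel=canonical link before </head>, unless one is already present."""
    if re.search(r'<link[^>]*rel=["\']canonical["\']', page, flags=re.IGNORECASE):
        return page  # already tagged; leave the page alone
    tag = f'<link rel="canonical" href="{escape(canonical_url, quote=True)}"/>\n'
    # Insert before the first closing head tag; a page without one is returned unchanged.
    return re.sub(r"(</head>)", lambda m: tag + m.group(1), page,
                  count=1, flags=re.IGNORECASE)


if __name__ == "__main__":
    sample = "<html><head><title>Old page</title></head><body></body></html>"
    print(add_canonical(sample, "http://httpd.apache.org/docs/current/"))
```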