www-infrastructure-dev mailing list archives

From Rob Weir <robw...@apache.org>
Subject Google Indexing of ooo-site.apache.org
Date Tue, 26 Aug 2014 15:11:47 GMT
I did the following search in Google:

Apache OpenOffice site:ooo-site.apache.org

That searches for Apache OpenOffice but limits results to pages on the
ooo-site.apache.org website. Google finds 3,470 pages.

I expected to find zero pages, since these should all be exposed as
www.openoffice.org pages.

As I understand it, Google penalizes duplicate content. Having pages
listed twice, under two different URLs, looks "spammy" to them.

Does anyone here have experience with this, or have ideas how to fix
it? Do we need a robots.txt on ooo-site? A rel="canonical" link
element in the HTML files? A different form of redirect? Any
recommendations? The goal should be to have only the
*.openoffice.org URLs be seen by search engine spiders.
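
To make two of those options concrete, here is a minimal Python sketch. It
assumes that ooo-site.apache.org and www.openoffice.org serve identical
paths, which I have not verified for every page. The function names
(canonical_url, canonical_link_tag) and the STAGING_ROBOTS_TXT constant are
hypothetical and only for illustration; none of this is existing site
tooling.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: both hosts expose the same content under the same paths.
STAGING_HOST = "ooo-site.apache.org"
PUBLIC_HOST = "www.openoffice.org"


def canonical_url(url: str) -> str:
    """Map a staging-host URL to the public host, keeping scheme, path and query.

    URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != STAGING_HOST:
        return url
    # An empty path is normalized to "/" so the result is a complete URL.
    return urlunsplit((parts.scheme, PUBLIC_HOST, parts.path or "/", parts.query, ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# Alternative option: a robots.txt served only on the staging host that
# asks all crawlers to stay away from every path.
STAGING_ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


if __name__ == "__main__":
    print(canonical_link_tag("http://ooo-site.apache.org/download/index.html"))
```

One caveat on combining the two: a crawler that obeys a blanket Disallow
never fetches the pages, so it would not see a canonical link placed in
them. The two approaches are probably best treated as alternatives rather
than used together on the same host.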


