Return-Path:
X-Original-To: apmail-incubator-allura-dev-archive@minotaur.apache.org
Delivered-To: apmail-incubator-allura-dev-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 4FC8F1084A
    for ; Mon, 26 Aug 2013 22:44:31 +0000 (UTC)
Received: (qmail 98094 invoked by uid 500); 26 Aug 2013 22:44:31 -0000
Delivered-To: apmail-incubator-allura-dev-archive@incubator.apache.org
Received: (qmail 98057 invoked by uid 500); 26 Aug 2013 22:44:18 -0000
Mailing-List: contact allura-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: allura-dev@incubator.apache.org
Delivered-To: mailing list allura-dev@incubator.apache.org
Received: (qmail 98049 invoked by uid 99); 26 Aug 2013 22:44:18 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Aug 2013 22:44:18 +0000
X-ASF-Spam-Status: No, hits=-0.1 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_MED,SPF_PASS
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of noreply@sourceforge.net designates 216.34.181.60 as permitted sender)
Received: from [216.34.181.60] (HELO smtp.ch3.sourceforge.com) (216.34.181.60)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Aug 2013 22:44:14 +0000
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sourceforge.com; s=x;
    h=Date:Message-ID:Subject:Reply-To:From:To:MIME-Version:Content-Type;
    bh=/YZv/fVr8bdO7kHb4YtFyjXEEpxDCIsX7E1VLd5jgUU=;
    b=AL7nHW1IQlws2QdDno3gP9I8iwpZDO+6nm95M8mZxx5tZs83JM+Wgjvn0UKq+HXChoy/uHYYwov57SbEgRy62O1Anw/LHmkm6maSADsnbfEw7C7ZSEbuZt9G1OLPBak86/0JEqs0gH4QCkJ1UbzOIp0RwsSu0PQ2r5ut65jmJ7s=;
Received: from localhost ([127.0.0.1] helo=sfs-alluradaemon-2.v29.ch3.sourceforge.com)
    by sfs-alluradaemon-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
    (envelope-from ) id 1VE5Vp-0002CU-Si
    for allura-dev@incubator.apache.org; Mon, 26 Aug 2013 22:43:53 +0000
Content-Type: multipart/related; boundary="===============5660432398249352117=="
MIME-Version: 1.0
To: "[allura:tickets] " <6595@tickets.allura.p.re.sf.net>
From: "Dave Brondsema"
Reply-To: "[allura:tickets] " <6595@tickets.allura.p.re.sf.net>
Subject: [allura:tickets] #6595 Prevent spiders from requesting tarballs
Message-ID:
Date: Mon, 26 Aug 2013 22:43:53 +0000
X-Virus-Checked: Checked by ClamAV on apache.org

--===============5660432398249352117==
Content-Type: multipart/alternative; boundary="===============6377937189874436259=="
MIME-Version: 1.0

--===============6377937189874436259==
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

If I do a GET on a rev for which no tarball has ever been requested, it says "Checking snapshot status..." and does ajax checks which return 'na' over and over. We need some way to let the user request the snapshot (put a POST form button right on that page?).

A smaller initial delay is great, but `// Check tarball status every 5 seconds` should be removed since it's inaccurate now. The upper limit of 600,000ms seems pretty high too; it might be good to drop that down while you're in there.

---

**[tickets:#6595] Prevent spiders from requesting tarballs**

**Status:** in-progress
**Labels:** stability
**Created:** Thu Aug 22, 2013 03:43 PM UTC by Dave Brondsema
**Last Updated:** Mon Aug 26, 2013 02:02 PM UTC
**Owner:** Tim Van Steenburgh

The following are examples of spiders requesting tarball creation. This is unnecessary and a waste of resources. We should make it impossible. We already have `rel=nofollow` but that apparently isn't working. I think the best solution is to require the URL to be a POST.

~~~~
"GET /p/z-i/code-0/208/tarball HTTP/1.0" 200 16400 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
"GET /p/jhotdraw/svn/729/tarball HTTP/1.0" 200 17834 "-" "msnbot/0.01 (+http://search.msn.com/msnbot.htm)"
"GET /p/fourpane/git4pane/ci/ec65df3a5ff2ec7be011c0722286e766c2b76d94/tarball HTTP/1.0" 200 18137 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
"GET /u/lluct/me722-cm/ci/0aa649648a00979ad6ca9e9d61df4e44eb694259/tarball?path=/external/clang HTTP/1.0" 200 17918 "-" "YisouSpider"
~~~~

---

Sent from sourceforge.net because allura-dev@incubator.apache.org is subscribed to https://sourceforge.net/p/allura/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/allura/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

--===============6377937189874436259==
MIME-Version: 1.0
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: 7bit

If I do a GET on a rev for which no tarball has ever been requested, it says "Checking snapshot status..." and does ajax checks which return 'na' over and over. We need some way to let the user request the snapshot (put a POST form button right on that page?).

A smaller initial delay is great, but the comment `// Check tarball status every 5 seconds` should be removed since it's inaccurate now. The upper limit of 600,000ms seems pretty high too; it might be good to drop that down while you're in there.
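
To make the polling suggestion concrete, here is a rough Python sketch of the behaviour I have in mind (the real check lives in the page's JavaScript, which I am not quoting here). It assumes the upper limit is a cap on the delay between checks; `poll_delays`, `next_action`, the default numbers, and the `'ready'` status value are all hypothetical. Only the `'na'` status comes from what I actually saw.

```python
from itertools import islice


def poll_delays(initial_ms=500, factor=2.0, cap_ms=60_000):
    """Yield delays in ms: start small, grow geometrically, never exceed cap_ms.

    All three defaults are illustrative, not values taken from the branch.
    """
    delay = float(initial_ms)
    while True:
        yield int(min(delay, cap_ms))
        delay *= factor


def next_action(status):
    """Map a status string from the ajax check to what the page should do."""
    if status == 'na':
        # No snapshot was ever requested: stop polling, offer a POST button.
        return 'show_request_button'
    if status == 'ready':  # hypothetical name for the "tarball exists" state
        return 'show_download_link'
    return 'keep_polling'


if __name__ == '__main__':
    print(list(islice(poll_delays(), 5)))
    print(next_action('na'))
```

The point is that an 'na' response ends the polling loop instead of repeating forever, and the cap keeps the longest wait well below ten minutes.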


[tickets:#6595] Prevent spiders from requesting tarballs

Status: in-progress
Labels: stability
Created: Thu Aug 22, 2013 03:43 PM UTC by Dave Brondsema
Last Updated: Mon Aug 26, 2013 02:02 PM UTC
Owner: Tim Van Steenburgh

The following are examples of spiders requesting tarball creation. This is unnecessary and a waste of resources. We should make it impossible. We already have `rel=nofollow` but that apparently isn't working. I think the best solution is to require the URL to be a POST.

"GET /p/z-i/code-0/208/tarball HTTP/1.0" 200 16400 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
"GET /p/jhotdraw/svn/729/tarball HTTP/1.0" 200 17834 "-" "msnbot/0.01 (+http://search.msn.com/msnbot.htm)"
"GET /p/fourpane/git4pane/ci/ec65df3a5ff2ec7be011c0722286e766c2b76d94/tarball HTTP/1.0" 200 18137 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
"GET /u/lluct/me722-cm/ci/0aa649648a00979ad6ca9e9d61df4e44eb694259/tarball?path=/external/clang HTTP/1.0" 200 17918 "-" "YisouSpider"

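As a minimal sketch of the POST-only idea, the logic could look roughly like the following. This is not Allura's controller code: `handle_tarball`, the status strings, and the list used as a queue are hypothetical stand-ins, and the choice to let GET return a read-only status is an assumption.

```python
def handle_tarball(method, path, queue):
    """Return (http_status, message); only POST may queue a snapshot."""
    if method == 'POST':
        # Creating a tarball is a side effect, so it is reachable only via POST.
        if path not in queue:
            queue.append(path)
        return 202, 'snapshot queued'
    if method in ('GET', 'HEAD'):
        # Safe methods report status only, so a spider's GET costs nothing.
        return 200, 'queued' if path in queue else 'na'
    return 405, 'method not allowed'


if __name__ == '__main__':
    queue = []  # stand-in for the real task queue
    print(handle_tarball('GET', '/p/z-i/code-0/208/tarball', queue))
    print(handle_tarball('POST', '/p/z-i/code-0/208/tarball', queue))
    print(queue)
```

Crawlers generally only follow links with GET, so none of the requests logged above would start any work under this scheme.
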
Sent from sourceforge.net because allura-dev@incubator.apache.org is subscribed to https://sourceforge.net/p/allura/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/allura/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

--===============6377937189874436259==--
--===============5660432398249352117==--