Message-ID: <321843887.121431265639967923.JavaMail.jira@brutus.apache.org>
Date: Mon, 8 Feb 2010 14:39:27 +0000 (UTC)
From: "Enda Farrell (JIRA)"
To: dev@couchdb.apache.org
Subject: [jira] Created: (COUCHDB-641) Should replication of recently purged documents keep trying?

Should replication of recently purged documents keep trying?
-----------------------------------------------------------

                 Key: COUCHDB-641
                 URL: https://issues.apache.org/jira/browse/COUCHDB-641
             Project: CouchDB
          Issue Type: Bug
          Components: Replication
    Affects Versions: 0.9
         Environment: couchdb 0.9.0.r766883 CentOS x86_64
            Reporter: Enda Farrell

We had a large doc, with 100000s of revisions, which was having trouble replicating. (Let's ignore the why, which is *probably* down to our networking.) We use pull replication on this 0.9 installation.

We wanted to remove that particular doc (as we were able to) from the databases so that the replicator would not have to keep trying to replicate it.

First we deleted it. Of course, this meant that the replication record for the doc had yet another entry for it.

We then compacted the database, hoping to reduce the number of revisions, but of course this wouldn't work either.

We then purged all open revisions of the doc from the source database, but the target still tried to replicate this doc. And tried. And tried, eventually causing the server to crash.

{code}
[Mon, 08 Feb 2010 09:56:10 GMT] [error] [<0.3542.0>] couch_rep HTTP get request failed after 10 retries: http://kv101.back.live.telhc.local:5986/madcache/MAD__mutex?revs=true&latest=true&open_revs=["15799-4207095478",....,"7286-464196713"]
[Mon, 08 Feb 2010 09:56:11 GMT] [error] [<0.3542.0>] replicator terminating with reason {http_request_failed,
    [104,116,116,112,58,47,47,107,118,49,
     48,49,46,98,97,99,107,46,108,105,118,
etc etc etc
{code}

In the above there were 900+ open revisions.

The *question* is this: _should_ the replicator still try to replicate docs which have been purged from the source?

* It is possible that this bug is invalid on 0.9+/0.10.x/0.11 - we haven't the ability to re-create the scenario.
* It is also possible that the [COUCHDB-416] fix has also fixed this - we haven't upgraded enough environments yet to verify, even if we could re-create the scenario.
* It's OK that the replicator tried to replicate all those revs, as they did indeed once exist - the question is only whether it should recognise that it can no longer access them and therefore stop requesting them.

Our workaround was to delete the target database entirely, restart the CouchDB instance, re-create a new database of the same name and re-replicate. Such a process is not always going to be available as an option in live production environments.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
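
A note on reading the log excerpt in the report: the list of integers after {http_request_failed, is how Erlang prints a string, one character code per element. A minimal Python sketch for decoding such a list follows. The function name decode_erlang_charlist is hypothetical (it is not part of CouchDB), and the input is only the truncated fragment quoted in the report, so the decoded text is truncated as well.

```python
import re


def decode_erlang_charlist(text):
    """Decode the character codes found in a printed Erlang list of integers.

    Hypothetical helper for reading log output; it is not a CouchDB API.
    It also works on a truncated list: it decodes whatever integers are present.
    """
    codes = [int(n) for n in re.findall(r"\d+", text)]
    return "".join(chr(c) for c in codes)


# Integer list copied from the log excerpt in the report (cut off there as well).
logged = "[104,116,116,112,58,47,47,107,118,49, 48,49,46,98,97,99,107,46,108,105,118,"

print(decode_erlang_charlist(logged))  # prints: http://kv101.back.liv
```

The decoded fragment matches the start of the URL in the preceding "HTTP get request failed" line, which suggests the termination reason simply carries the same request URL that had already failed after 10 retries.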