From: Wayne Conrad
Date: Mon, 21 Feb 2011 11:18:39 -0700
To: user@couchdb.apache.org
Subject: Replication: stalled?

I'm seeing replication behavior that I don't understand. I wonder if it's stalled.

I've got two couchdb servers, each with four databases.
A cron job runs once a minute and tells each server to do continuous replication from each database on the other server.

What I'm seeing for one of the databases has me confused: in the couch.log for the "source" database, I see 'GET' entries consistent with documents and their attachments being fetched from the source database. But on the destination database, the fetched documents and attachments do not appear.

To answer the question, "Are the GET entries coming from some other instance of couchdb?", I stopped couchdb on the destination server. The 'GET' entries in the log of the source server stopped. I then restarted the destination couchdb server and the log entries resumed.

Futon on the source database shows:

  Overview:
    Name: ps
    Size: 181.5 GB
    Number of Documents: 43,090
    Update Seq: 43,741

  Status:
    Type: Replication
    Object: 49a5c5: http://carbon:5984/ps/ -> ps
    PID: <0.14439.1841>
    Status: MR Processed source update #6114

Futon on the destination database shows:

  Overview:
    Name: ps
    Size: 58.3 GB
    Number of Documents: 6,107
    Update Seq: 6,114

  Status:
    Type: Replication
    Object: 0c52c5: http://sodium:5984/ps/ -> ps
    PID: <0.234.0>
    Status: Starting

The status on the destination database has been "Starting" since I restarted couchdb on the destination server about 15 minutes ago.

Both the source and destination databases are being written to by user processes on an intermittent basis: anywhere from 0 to a few dozen documents per minute, each document with up to a few dozen megabytes of attachments.

I see no error entries in either the source or destination server's couch.log.

Versions:

  Couchdb: 1.0.2
  OS: Linux 2.6.32 (AMD 64)

Why don't I see any documents being added to the destination database?
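
For anyone trying to reproduce the setup, here is a rough sketch of the kind of request the cron job issues and of a simple progress check. It is not the actual cron script. It assumes the standard CouchDB HTTP API (POST /_replicate with "continuous": true, GET /_active_tasks, and GET /ps for the database info), and it assumes the 1.0.x task entries carry "type", "task", "status" and "pid" fields. The helper names are made up for illustration.

```python
import json
import urllib.request


def replication_request(source_host, db, port=5984):
    """Build the JSON body for POST /_replicate: pull `db` from the
    other server into the local database of the same name."""
    return {
        "source": "http://%s:%d/%s/" % (source_host, port, db),
        "target": db,
        "continuous": True,
    }


def replication_tasks(active_tasks):
    """Keep only the replication entries from a GET /_active_tasks result.
    Assumes each entry is a dict with a "type" field, as in CouchDB 1.0.x."""
    return [t for t in active_tasks if t.get("type") == "Replication"]


def docs_behind(source_info, target_info):
    """Difference in doc_count between two GET /<db> info documents.
    Only a rough indicator: both sides are also written to locally."""
    return source_info["doc_count"] - target_info["doc_count"]


def get_json(url):
    """GET a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def post_json(url, body):
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def trigger_and_report(local, remote_host, db):
    """Hypothetical cron entry point: (re)start the continuous pull on the
    local server, then print its replication tasks and the doc gap."""
    post_json(local + "/_replicate", replication_request(remote_host, db))
    for task in replication_tasks(get_json(local + "/_active_tasks")):
        print(task.get("task"), "|", task.get("status"))
    gap = docs_behind(
        get_json("http://%s:5984/%s" % (remote_host, db)),
        get_json("%s/%s" % (local, db)),
    )
    print("documents behind:", gap)
```

Run on the destination server, this would be called as trigger_and_report("http://localhost:5984", "sodium", "ps"). With the Futon figures above, docs_behind works out to 43,090 - 6,107 = 36,983 documents.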