From: David King
To: couchdb-user@incubator.apache.org
Subject: Re: Practical storage limit
Date: Wed, 16 Jul 2008 15:56:53 -0700

> We'd
> love to hear what you come up with and also to solve any
> problems you might encounter on your way. Please let us know. Please
> note that CouchDB at this point is not optimised. We are still in
> the 'getting it right' phase before we come to the 'getting it
> fast'. That said, CouchDB is plenty fast already, but there is also
> the potential to greatly speed up things.

So I'm trying a smaller version of this first (9 million records), and I've hit a snag. I have some rather simple python code to read from Postgres and write to couchdb (that uses couchdb-python, where 'db' is a couchdb.client.Database object):

    chunker = IteratorChunker(get_stuff())
    while not chunker.done:
        print "fetching"
        chunk = chunker.next_chunk(1000)
        if chunk:
            print "Adding %d items, starting with %s" % (len(chunk),chunk[0]['_id'])
            db.update(chunk)

db.update(docs) (see , line 360) uses the bulk API, like:

    data = self.resource.post('_bulk_docs', content={'docs': documents})

At apparently random points throughout this process, but almost always before 15,000 records or so, the process dies with an exception, the tail end of which looks like:

      File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 707, in send
        self.sock.sendall(str)
      File "", line 1, in sendall
    socket.error: (54, 'Connection reset by peer')

If I have Futon up while it's running, I occasionally get a Javascript error along the lines of "killed" (reproducing it is difficult) at the same time.

I could have it catch the reset connection and re-try, but why would this be happening?
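
For reference, the catch-and-retry workaround mentioned above could look roughly like the sketch below. It is only a sketch under assumptions: chunks() and update_with_retry() are hypothetical helpers (they are not part of couchdb-python), the retry count and delays are arbitrary, and it is written for Python 3, where socket.error is an alias of OSError, rather than the Python 2.5 used above. It only papers over the reset and does not explain it, and I have not verified how a re-sent chunk behaves if the first attempt was partly applied.

```python
import socket
import time
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def update_with_retry(update, chunk, attempts=5, delay=1.0, sleep=time.sleep):
    """Call update(chunk), retrying when the socket layer raises an error.

    `update` stands in for db.update (hypothetical wiring); `attempts`
    and `delay` are arbitrary illustrative values.
    """
    for attempt in range(1, attempts + 1):
        try:
            return update(chunk)
        except socket.error:
            # On Python 3 this catches every OSError, which includes
            # ConnectionResetError ("Connection reset by peer").
            if attempt == attempts:
                raise  # out of retries, surface the original error
            # Back off exponentially: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
```

With those helpers, the loop above would become something like "for chunk in chunks(get_stuff(), 1000): update_with_retry(db.update, chunk)".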