Subject: Re: CouchDB Crash report db_not_found when attempting to replicate databases
From: Mikeal Rogers
Date: Wed, 14 Sep 2011 13:10:43 -0700
To: user@couchdb.apache.org

HAHA! I already forgot that we did this.

-Mikeal

On Sep 14, 2011, at 12:51 PM, Randall Leeds wrote:

> On Wed, Sep 14, 2011 at 12:19, Adam Kocoloski wrote:
>
>> There's a multipart API which allows for a single PUT request containing
>> the document body as JSON and all its attachments in their raw form.
>> Documentation is pretty thin at the moment, and unfortunately I think it
>> doesn't quite allow for a pipe(). Would be really nice if it did, though.
>
> It does. We figured it out together a couple weeks ago and that's when
> this code came into being.
> Requesting a _specific_ revision with ?revs=true will give you a
> multipart/related response suitable for passing straight into a
> ?new_edits=false&rev= PUT.
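
[A minimal sketch of the GET-to-PUT pipe described above, in node.js with the
request module (the actual implementation is linked just below). The SOURCE
and TARGET URLs, the document id, and the attachments=true parameter are
assumptions for illustration, not details taken from the thread.]

    // Copy one specific revision, attachments and all, by piping CouchDB's
    // multipart/related GET response straight into a PUT on the target.
    var request = require('request')

    var SOURCE = 'http://localhost:5984/source_db'   // placeholder
    var TARGET = 'http://localhost:5984/target_db'   // placeholder

    function copyRev (id, rev, cb) {
      var get = request({
        url: SOURCE + '/' + encodeURIComponent(id) +
             '?rev=' + rev + '&revs=true&attachments=true',
        headers: { accept: 'multipart/related, application/json' }
      })
      var put = request.put(
        TARGET + '/' + encodeURIComponent(id) + '?new_edits=false&rev=' + rev,
        function (err, resp) { cb(err, resp && resp.statusCode) }
      )
      // pipe() streams the multipart body into the PUT; request also carries
      // the response content-type across, so the multipart boundary survives.
      get.pipe(put)
    }

    copyRev('example-doc', '3-abc123def', function (err, status) {
      if (err) throw err
      console.log('target responded with', status)
    })
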
> See https://github.com/mikeal/replicate/blob/master/main.js#L49
>
>> On Wednesday, September 14, 2011 at 1:16 PM, Mikeal Rogers wrote:
>>
>>> npm is mostly attachments and I haven't seen any issues so far.
>>>
>>> I wish there was a better way to replicate attachments atomically for
>>> a single revision but if there is, I don't know about it.
>>>
>>> It's probably a huge JSON operation and it sucks, but I don't have to
>>> parse it in node.js, I just pipe() the body right along.
>>>
>>> -Mikeal
>>>
>>> On Sep 14, 2011, at 8:42 AM, Adam Kocoloski wrote:
>>>
>>>> Hi Mikeal, I just took a quick peek at your code. It looks like you
>>>> handle attachments by inlining all of them into the JSON
>>>> representation of the document. Does that ever cause problems when
>>>> dealing with the ~100 MB attachments in the npm repo?
>>>>
>>>> I've certainly seen my fair share of problems with attachment
>>>> replication in CouchDB 1.0.x. I have a sneaking suspicion that there
>>>> are latent bugs related to incorrect determinations of Content-Length
>>>> under various compression scenarios.
>>>>
>>>> Adam
>>>>
>>>> On Tuesday, September 13, 2011 at 5:08 PM, Mikeal Rogers wrote:
>>>>
>>>>> My replicator is fairly young so I think calling it "reliable" might
>>>>> be a little misleading.
>>>>>
>>>>> It does less: I never attempt to cache the high watermark (last seq
>>>>> written) and resume from there. If the process crashes, it just
>>>>> starts over from scratch. This can lead to a delay after restart,
>>>>> but I find that it's much simpler and more reliable on failure.
>>>>>
>>>>> It's also simpler because it doesn't have to contend with being an
>>>>> HTTP client and a client of the internal CouchDB Erlang API. It just
>>>>> proxies requests from one couch to another.
>>>>>
>>>>> While I'm sure there are bugs in it that I haven't found yet, I can
>>>>> say that it replicates the npm repository quite well and I'm using
>>>>> it in production.
>>>>>
>>>>> -Mikeal
>>>>>
>>>>> On Sep 13, 2011, at 11:44 AM, Max Ogden wrote:
>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> From what I understand, the current state of the replicator (as of
>>>>>> 1.1) is that for certain types of collections of documents it can
>>>>>> be somewhat fragile. In the case of the node.js package repository,
>>>>>> http://npmjs.org, there are many relatively large (~100MB)
>>>>>> documents that would sometimes throw errors or time out during
>>>>>> replication and crash the replicator, at which point the replicator
>>>>>> would restart and attempt to pick up where it left off. I am not an
>>>>>> expert in the internals of the replicator, but apparently the
>>>>>> cumulative time required for the replicator to repeatedly crash and
>>>>>> then relocate itself in the _changes feed was making the built-in
>>>>>> couch replicator unusable for replicating the node package manager.
>>>>>>
>>>>>> Two solutions exist that I know of. There is a new replicator in
>>>>>> trunk (not to be confused with the _replicator db from 1.1 -- that
>>>>>> still uses the old replicator algorithms) and there is also a more
>>>>>> reliable replicator written in node.js,
>>>>>> https://github.com/mikeal/replicate, which was written specifically
>>>>>> to replicate the node package repository between hosting providers.
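
[For concreteness, a rough sketch of the start-from-scratch approach Mikeal
describes above: every run reads the source's _changes feed from the
beginning and copies each leaf revision, with no saved checkpoint. It reuses
the copyRev() and SOURCE placeholders from the earlier sketch and is not
mikeal/replicate's actual code or API.]

    // Re-read the whole _changes feed on every start; if the process dies,
    // just run replicateOnce() again -- no high watermark is ever cached.
    var request = require('request')

    function replicateOnce (cb) {
      request(SOURCE + '/_changes?style=all_docs', function (err, resp, body) {
        if (err) return cb(err)
        var changes = JSON.parse(body)
        var pending = 0
        changes.results.forEach(function (change) {
          change.changes.forEach(function (leaf) {
            pending++
            // copyRev() is the multipart GET-to-PUT pipe sketched earlier.
            copyRev(change.id, leaf.rev, function () {
              if (--pending === 0) cb(null, changes.last_seq)
            })
          })
        })
        if (pending === 0) cb(null, changes.last_seq)
      })
    }

    replicateOnce(function (err, seq) {
      if (err) throw err
      console.log('copied everything up through seq', seq)
    })
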
>>>>>>
>>>>>> Additionally, it may be useful if you could describe the
>>>>>> 'fingerprint' of your documents a bit. How many documents are in
>>>>>> the failing databases? Are the documents large or small? Do they
>>>>>> have many attachments? How large is your _changes feed?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Max
>>>>>>
>>>>>> On Tue, Sep 13, 2011 at 11:22 AM, Chris Stockton wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> We now have about 150 dbs that are refusing to replicate, with
>>>>>>> random crashes which provide really zero debug information. The
>>>>>>> error is db not found, but I know it's available. Does anyone know
>>>>>>> how I can troubleshoot this? Do we just have too many databases
>>>>>>> replicating for couchdb to handle? 4000 is a small number for the
>>>>>>> massive hardware these are running on.
>>>>>>>
>>>>>>> -Chris
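
[For reference, the built-in replicator being discussed here is normally
driven either by a POST to /_replicate or, in 1.1, by documents in the
_replicator database. A minimal node.js example of the former, with
hypothetical host and database names; not a fix, just the shape of the call.]

    // Trigger a single replication through the _replicate endpoint.
    var request = require('request')

    request({
      method: 'POST',
      url: 'http://couch-a.example.com:5984/_replicate',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        source: 'http://couch-b.example.com:5984/db_0001', // fully-qualified URL
        target: 'db_0001'                                   // local to couch-a
      })
    }, function (err, resp, body) {
      if (err) throw err
      // Errors such as db_not_found are reported from the point of view of
      // the server doing the replication, so the names/URLs above must be
      // resolvable and reachable from couch-a itself.
      console.log(resp.statusCode, body)
    })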