From: Miles Fidelman
Date: Mon, 26 Oct 2009 13:05:36 -0400
To: user@couchdb.apache.org
Subject: Re: massive replication?
Message-ID: <4AE5D6E0.6090902@meetinghouse.net>
In-Reply-To: <05DD887F-C949-4A06-8A72-0ADBECEA05EB@apache.org>

Adam Kocoloski wrote:
> On Oct 26, 2009, at 11:35 AM, Chris Anderson wrote:
>
>>>> 2) When these CouchDB servers drop off for an extended period and
>>>> then rejoin, how do they subscribe to the update feed from the
>>>> replication bus at a particular sequence? This is really the key
>>>> element of the setup. When I think of multicasting I think of video
>>>> feeds and such, where if you drop off and rejoin you don't care
>>>> about the old stuff you missed. That's not the case here. Does the
>>>> bus store all this old feed data?
>>>
>>> Think of something like RSS, but with distributed infrastructure.
>>> A node would publish an update to a specific address (e.g., like
>>> publishing an RSS feed).
>>>
>>> All nodes would subscribe to the feed, and receive new messages in
>>> sequence. When picking up updates, you ask for everything after a
>>> particular sequence number. The update service maintains the data.
>>
>> The best candidate for an update service like this is probably a
>> CouchDB.
>
> Sounds that way to me, too, although that could be because CouchDB is
> the hammer I know really well.
>
> I'm still trying to figure out how multicast fits into the picture. I
> can see it really helping to reduce bandwidth and server load in a
> case where the nodes are all expected to be online 100% of the time,
> but if nodes are coming and going they're likely to be requesting
> feeds at different starting sequences much of the time. What's the
> win in that case?
>

Doesn't seem that way to me. At the very least, for a fully distributed
design (which is what we're seeking), this would require a backbone of
multiple CouchDB instances plus a management infrastructure of some sort.

What I'm looking for is a way to avoid:

1. any kind of central node
2. the need to manage an unbounded number of 1-1 links between nodes

That requires some kind of many-many protocol that takes care of moving
messages around.

Miles

-- 
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra
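
For illustration only: the "ask for everything after a particular sequence number" pickup described in the quoted text could be sketched roughly as below. This is an in-memory toy, not a distributed or multicast implementation, and the names `UpdateFeed`, `publish`, and `since` are hypothetical (they are not CouchDB APIs and do not come from the thread).

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class UpdateFeed:
    """Hypothetical append-only update log kept by the update service."""

    _log: List[Any] = field(default_factory=list)

    def publish(self, update: Any) -> int:
        """Append an update and return its 1-based sequence number."""
        self._log.append(update)
        return len(self._log)

    def since(self, seq: int) -> List[Tuple[int, Any]]:
        """Return (sequence, update) pairs for every update after `seq`."""
        if seq < 0:
            raise ValueError("seq must be non-negative")
        # _log[seq] holds the update with sequence number seq + 1.
        return [(i + 1, u) for i, u in enumerate(self._log[seq:], start=seq)]


feed = UpdateFeed()
for doc in ("a", "b", "c"):
    feed.publish(doc)

# A node that last saw sequence 1 rejoins and catches up from there.
missed = feed.since(1)
```

A node that was offline only needs to remember the last sequence number it processed; whether the log lives in a single CouchDB or in some distributed service is exactly the open question in the thread.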