From: Damien Katz
To: dev@couchdb.apache.org
Subject: Re: Partitioned Clusters
Date: Fri, 20 Feb 2009 14:55:20 -0500

On Feb 20, 2009, at 1:55 PM, Stefan Karpinski wrote:

> Hi, I thought I'd introduce myself since I'm new here on the couchdb
> list. I'm Stefan Karpinski. I've worked in the Monitoring Group at
> Akamai, Operations R&D at Citrix Online, and I'm nearly done with a
> PhD in computer networking at the moment. So I guess I've thought
> about this kind of stuff a bit ;-)
>
> I'm curious what the motivation behind a tree topology is. Not that
> it's not a viable approach, just why that and not a load-balancer in
> front of a bunch of "leaves" with lateral propagation between the
> leaves? Why should the load-balancing/proxying/caching node even be
> running couchdb?
>
> One reason I can see for a tree topology would be the hierarchical
> cache effect. But that would likely only make sense in certain
> circumstances. Being able to configure the topology to meet various
> needs, rather than enforcing one particular topology makes more sense
> to me overall.

Trees would be overkill except with very large clusters.

With CouchDB map views, you need to combine results from every node in
a big merge sort. If you combine all results at a single node, that
single client's ability to simultaneously pull and sort data from all
other nodes may become the bottleneck.
So to parallelize, you have multiple nodes each doing a merge sort of
their subnodes, then sending those results to another node to be
combined further, and so on. The same goes for reduce views, but
instead of a merge sort it's just rereducing results.

The natural "shape" of that computation is a tree. Only the final root
node at the top remains a bottleneck, but it now has to maintain
connections to, and merge the sorted values from, far fewer nodes.

-Damien
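
A minimal sketch of the tree-shaped merge described in the message above, in Python. This is not CouchDB code. The names `merge_sorted`, `tree_merge`, `tree_rereduce`, and `fan_in` are hypothetical. Each leaf is assumed to hold an already-sorted list of `(key, value)` view rows, and every group of `fan_in` lists stands in for one interior node merging the output of its children.

```python
import heapq


def merge_sorted(child_results):
    """Merge already-sorted (key, value) row lists into one sorted list."""
    return list(heapq.merge(*child_results, key=lambda row: row[0]))


def tree_merge(leaf_results, fan_in=2):
    """Merge leaf results level by level until a single root list remains.

    Each slice of `fan_in` lists models one interior node, so no node
    ever merges more than `fan_in` inputs at once.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = [list(rows) for rows in leaf_results]
    if not level:
        return []
    while len(level) > 1:
        level = [merge_sorted(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]


def tree_rereduce(leaf_values, rereduce, fan_in=2):
    """Same tree shape for reduce views: combine partial reductions upward.

    `rereduce` is assumed to take a list of partial results and return
    one combined result (for example, `sum` for a counting view).
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(leaf_values)
    if not level:
        raise ValueError("need at least one partial result")
    while len(level) > 1:
        level = [rereduce(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]


if __name__ == "__main__":
    leaves = [
        [("a", 1), ("d", 4)],
        [("b", 2)],
        [("c", 3), ("e", 5)],
        [("a", 9)],
    ]
    print(tree_merge(leaves, fan_in=2))
    print(tree_rereduce([3, 5, 7, 11, 13], sum, fan_in=2))
```

With `fan_in=2` and four leaves, the root merges two lists instead of four. In a real cluster the merges within one level would run on separate nodes in parallel, which this single-process sketch does not model.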