From: Troy Kruthoff
To: user@couchdb.apache.org
Subject: Re: Designing a database
Date: Mon, 28 Dec 2009 22:44:04 -0800

I made an app that does exactly what you described, although it was a "for-fun" hack just to showcase couch to my buddies (it was called tweetmesexy, so you can imagine how much fun it actually was)...

What I ended up doing was:

1) Each new comment is a new doc that references the tweet id
2) Use view collation to get the tweet(s) and comments via a single http call (http://wiki.apache.org/couchdb/View_collation)
3) Run a script (via cron or whatever) to move the comments to the tweet (and delete the comments) when the tweet is no longer "hot". This is not required, but in our case it allowed us to do some nifty analytics thanks to couch's incremental map/reduce

As for whether couch is a good fit for "update-heavy" applications, I think an RDBMS has advantages in a true "update" scenario (like 'update stats set counter=counter+1').
But remember, you are only using the word "update" because couch's awesomeness allows you to even consider storing the comments inline with the doc. Technically you can do the same with an SQL database, using a serialized blob, and have the same conflict issues (without built-in revision love).

So assuming I'm correct that the structure of your data will be similar whether you use an SQL database or couch, you would be well served with couch:

1) You can archive the comments inline, as I mentioned above, and run cool map/reduce on the tweet and comments together
2) Simple master-master, allowing you to scale writes to your heart's content
3) With SQL you'll need multiple queries (or go the ugly join route) to get the comments and the tweet, vs a single http call

Bottom line: just because you find yourself structuring your data like you would in an SQL database does not negate the other advantages of couch.

Troy

On Dec 28, 2009, at 10:09 PM, Sean Clark Hess wrote:

> Our system will have comments related to live data - imagine people
> commenting on tweets right after they are written.
>
> I'm having trouble deciding how to model it. It makes a lot of sense
> to make one document containing all the comments for each data
> segment, but we could theoretically have hundreds of users commenting
> on the same segment at once.
>
> Would data consistency become a nightmare? With an RDBMS you would
> have a comments table, and insert a new row for each comment -
> preventing conflicts. I could do the same thing with couch, by adding
> a separate document for each comment, but it seems to violate a
> fundamental principle of couch.
>
> Is Couch DB a bad fit for an update-heavy system? Updates will only
> be heavy within the first minute or so after the data is released,
> then it will switch to a very read-heavy system.
>
> Thanks for your help
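
For illustration, the three steps in the reply (one doc per comment, view collation, later archiving) could look roughly like the sketch below. It is plain Python that only imitates how a view orders its keys; it does not talk to a CouchDB server, and in a real design doc the map function would be written in JavaScript. The field names (type, tweet_id, created_at), the key layout, and the helper names are all assumptions made for this example, not anything taken from tweetmesexy.

```python
# Hypothetical documents: a tweet doc plus one separate doc per comment (step 1).
docs = [
    {"_id": "c2", "type": "comment", "tweet_id": "t1", "created_at": 20, "text": "second"},
    {"_id": "t1", "type": "tweet", "text": "hello world"},
    {"_id": "c1", "type": "comment", "tweet_id": "t1", "created_at": 10, "text": "first"},
    {"_id": "t2", "type": "tweet", "text": "another tweet"},
]


def map_doc(doc):
    """Yield (key, value) rows roughly the way a view's map function emits them."""
    if doc["type"] == "tweet":
        # [tweet id, 0] sorts ahead of every comment key for the same tweet.
        yield [doc["_id"], 0], doc
    elif doc["type"] == "comment":
        # [tweet id, 1, timestamp] keeps comments after the tweet, oldest first.
        yield [doc["tweet_id"], 1, doc["created_at"]], doc


def build_view(all_docs):
    """Collect all emitted rows and sort them by key, imitating view collation."""
    rows = [row for doc in all_docs for row in map_doc(doc)]
    return sorted(rows, key=lambda row: row[0])


def tweet_with_comments(view, tweet_id):
    """Return the tweet followed by its comments (step 2).

    Against a real view this would be one request over a key range,
    for example startkey=[tweet_id] and endkey=[tweet_id, {}].
    """
    return [value for key, value in view if key[0] == tweet_id]


def archive(rows):
    """Fold the comment rows into the tweet doc (step 3).

    A real script would save the result and then delete the comment docs.
    """
    tweet, *comments = rows
    inline = [{"text": c["text"], "created_at": c["created_at"]} for c in comments]
    return {**tweet, "comments": inline}


view = build_view(docs)
rows = tweet_with_comments(view, "t1")
print([d["_id"] for d in rows])  # ['t1', 'c1', 'c2']
print(archive(rows)["comments"])
```

Because every comment is its own doc, concurrent commenters never write to the same document, so the conflict worry in the original question does not come up until the archive step, which runs once the tweet is no longer "hot".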