Date: Tue, 25 Nov 2008 10:53:16 -0800
From: "Chris Anderson"
To: couchdb-user@incubator.apache.org
Subject: Re: Evaluating CouchDB

On Mon, Nov 24, 2008 at 9:24 AM, Peter Herndon wrote:
>
> Anyway, that's my current use case, and my next use case. I know that
> CouchDB isn't finished yet, and hasn't been optimized yet, but does
> anyone have any opinions on whether CouchDB would be a reasonable fit
> for managing the metadata associated with each object?

I think CouchDB is pretty much designed with this use case in mind. If
you were lucky enough to convince the organization to switch from XML
to JSON, the software would pretty much write itself. And CouchDB does
a fairly decent job of dealing with XML as well (using SpiderMonkey's
E4X engine), so that's not even required.

> And, likewise,
> would CouchDB be a reasonable fit for managing the binary datastreams?
> Would it be practical to store the datastreams in CouchDB itself, and
> up to what size limit/throughput limit?

CouchDB's attachment support is pretty much designed for this use case
(attachments can be multi-GB files, and aren't sent to view servers).
From your description, it sounds like you are maxing out IO at the
network level, so it's hard to say how CouchDB would interact with
such a stream without seeing it in action.

However, CouchDB's replication and distribution capabilities should
make managing multi-site projects as simple as one can hope for. If
you shard projects as databases, then you can use replication to make
them available on the local network for the various sites, which
should make it easier to avoid load bottlenecks at a central
repository.

> Would it be better to store
> the datastreams externally and use CouchDB to manage the metadata and
> access control?

It's not clear - obviously importing TBs of data from a filesystem to
CouchDB will take time and expense, even if CouchDB handles it
swimmingly. The nice thing about the schemaless documents is that you
can be flexible going forward, maybe referencing some assets via URIs
and storing others as attachments.

> Also, looking down the road, are there plans for
> CouchDB's development that would improve its fitness for this purpose
> in the future?
>

Your project sounds like a good fit for CouchDB. Of course, you are
talking about working on the high end of the performance / scalability
curve, and CouchDB is relatively new, so you'll have to be comfortable
as a trail-blazer (not that you'd be the only one, but with a new
technology, you'll be in a smaller crowd than if you used something
that's been around longer).

I think the biggest positive reason to use CouchDB for your project is
the ease of federation / distribution / offline work. Once you've
built the business rules and document format around your project and
CouchDB, booting up other instances of the project for more media
collections should be straightforward.
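
To make the storage options and the replication idea above a bit more
concrete, here is a minimal Python sketch. It is an illustration, not
part of the original message: the field name `datastream_uri`, the
database names, and the hosts are all hypothetical. It assumes
CouchDB's inline `_attachments` form (base64 `data` plus a
`content_type`) and the `_replicate` resource, which takes a JSON body
with `source` and `target`. The code only builds the document and the
request; nothing is sent over the network.

```python
import base64
import json
from urllib.request import Request


def asset_document(doc_id, metadata, external_uri=None, payload=None,
                   filename="datastream.bin",
                   content_type="application/octet-stream"):
    """Build a JSON-ready document for one asset.

    The binary is either referenced by URI, embedded as an inline
    attachment, or both, depending on which arguments are given.
    """
    doc = {"_id": doc_id, "metadata": metadata}
    if external_uri is not None:
        # The binary stays where it is; only a pointer is stored.
        # "datastream_uri" is a hypothetical field name.
        doc["datastream_uri"] = external_uri
    if payload is not None:
        # Inline attachment: base64-encoded bytes plus a content type.
        doc["_attachments"] = {
            filename: {
                "content_type": content_type,
                "data": base64.b64encode(payload).decode("ascii"),
            }
        }
    return doc


def replication_request(server, source_db, target_db):
    """Prepare (but do not send) a POST to the server's _replicate resource."""
    body = json.dumps({"source": source_db, "target": target_db})
    return Request(
        url=server.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # One asset kept on an external store, one small asset embedded.
    by_reference = asset_document(
        "scan-0001", {"title": "Example scan"},
        external_uri="http://files.example.org/scan-0001.tif")
    embedded = asset_document(
        "note-0001", {"title": "Example note"}, payload=b"hello")
    print(json.dumps(by_reference, indent=2))
    print(json.dumps(embedded, indent=2))

    # One database per project, copied to a (hypothetical) site-local server.
    req = replication_request(
        "http://localhost:5984", "project_a",
        "http://site-b.example.org:5984/project_a")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Inline base64 only makes sense for small payloads; for the multi-GB
files discussed above you would more likely upload the attachment
separately or keep it external and store the URI.
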
Because the documents will be more self-contained than what you'd have
with a SQL store, for instance, you could build something amenable to
merging multiple repositories, or splitting off just a portion of a
repository for a particular purpose. This flexibility seems like a big
win, as it would allow you to respond to things like datacenter-level
bottlenecks with changes that users will understand, such as moving
just the necessary sub-collections to a more local server.

Good luck and keep us up to date with your progress.

Chris

--
Chris Anderson
http://jchris.mfdz.com