incubator-couchdb-user mailing list archives

From "Peter Herndon" <tphern...@gmail.com>
Subject Re: Evaluating CouchDB
Date Wed, 26 Nov 2008 21:12:11 GMT
Thanks very much, Chris, I greatly appreciate your insight.  I'll keep
you informed on how things work out.

---Peter

On Tue, Nov 25, 2008 at 1:53 PM, Chris Anderson <jchris@apache.org> wrote:
> On Mon, Nov 24, 2008 at 9:24 AM, Peter Herndon <tpherndon@gmail.com> wrote:
>
>>
>> Anyway, that's my current use case, and my next use case.  I know that
>> CouchDB isn't finished yet, and hasn't been optimized yet, but does
>> anyone have any opinions on whether CouchDB would be a reasonable fit
>> for managing the metadata associated with each object?
>
> I think CouchDB is pretty much designed with this use case in mind. If
> you were lucky enough to convince the organization to switch from XML
> to JSON, the software would pretty much write itself. And CouchDB does
> a fairly decent job of dealing in XML, as well (using Spidermonkey's
> E4X engine) so that's not even required.
>
>> And, likewise,
>> would CouchDB be a reasonable fit for managing the binary datastreams?
>>  Would it be practical to store the datastreams in CouchDB itself, and
>> up to what size limit/throughput limit?
>
> CouchDB's attachment support is pretty much designed for this use case
> (attachments can be multi-GB files, and aren't sent to view servers).
> From your description, it sounds like you are maxing out IO at the
> network level, so it's hard to say how CouchDB would interact with
> such a stream, without seeing it in action. However, CouchDB's
> replication and distribution capabilities should make managing
> multi-site projects as simple as one can hope for. If you shard
> projects as databases, then you can use replication to make them
> available on the local network for the various sites, which should
> make it easier to avoid load bottlenecks at a central repository.
>
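
A minimal sketch of the per-project replication described above, assuming CouchDB's HTTP `_replicate` endpoint. The host names and the database name `project_a` are hypothetical, and the request is only built here, not sent:

```python
import json
from urllib.request import Request


def build_replication_request(local_server, source_db_url, target_db):
    """Build (but do not send) a POST to CouchDB's _replicate endpoint.

    All server and database names passed in are illustrative assumptions.
    """
    body = json.dumps({"source": source_db_url, "target": target_db})
    return Request(
        url=local_server.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Pull one project's database from a central server to a site-local one.
req = build_replication_request(
    "http://couch.site-a.example:5984",
    "http://central.example:5984/project_a",
    "project_a",
)
# To actually trigger replication: urllib.request.urlopen(req)
```

Treating each project as its own database keeps the replication unit small, so a site only pulls the collections it needs.
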
>> Would it be better to store
>> the datastreams externally and use CouchDB to manage the metadata and
>> access control?
>
> It's not clear - obviously importing TBs of data from a filesystem to
> CouchDB will take time and expense, even if CouchDB handles it
> swimmingly. The nice thing about the schemaless documents is that you
> can be flexible going forward, maybe referencing some assets via URIs
> and storing others as attachments.
>
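
To illustrate the mix of URI references and attachments mentioned above, here is a rough sketch of two document shapes. The `_attachments` layout follows CouchDB's inline-attachment convention (base64 data plus a content type); the field name `asset_uri`, the document ids, and the file names are hypothetical:

```python
import base64


def doc_with_inline_attachment(doc_id, name, content_type, payload):
    """Document that carries its datastream as an inline CouchDB attachment."""
    return {
        "_id": doc_id,
        "_attachments": {
            name: {
                "content_type": content_type,
                # Inline attachments are sent as base64 text inside the JSON.
                "data": base64.b64encode(payload).decode("ascii"),
            }
        },
    }


def doc_with_external_asset(doc_id, uri):
    """Document that only points at a datastream stored elsewhere.

    "asset_uri" is an assumed field name, not a CouchDB convention.
    """
    return {"_id": doc_id, "asset_uri": uri}


small = doc_with_inline_attachment("obj-1", "thumb.png", "image/png", b"\x89PNG")
large = doc_with_external_asset("obj-2", "file:///archive/obj-2/master.tif")
```

Because the documents are schemaless, both shapes can live in the same database, and an asset can move from one form to the other later.
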
>> Also, looking down the road, are there plans for
>> CouchDB's development that would improve its fitness for this purpose
>> in the future?
>>
>
> Your project sounds like a good fit for CouchDB. Of course, you are
> talking about working on the high end of the performance / scalability
> curve, and CouchDB is relatively new, so you'll have to be comfortable
> as a trail-blazer (not that you'd be the only one, but with a new
> technology, you'll be in a smaller crowd than if you used something
> that's been around longer.)
>
> I think the biggest positive reason to use CouchDB for your project is
> the ease of federation / distribution / offline work. Once you've
> built the business-rules and document format around your project and
> CouchDB, booting up other instances of the project for more media
> collections should be straightforward. Because the documents will be
> more self-contained than what you'd have with a SQL store, for
> instance, you could build something amenable to merging multiple
> repositories, or splitting off just a portion of a repository for a
> particular purpose. This flexibility seems like a big win, as it would
> allow you to respond to things like datacenter-level bottlenecks with
> changes that users will understand, such as moving just the necessary
> sub-collections to a more local server.
>
> Good luck and keep us up to date with your progress.
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>
