incubator-couchdb-user mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: Two Concerns
Date Thu, 31 Dec 2009 05:19:10 GMT
On Wed, Dec 30, 2009 at 8:57 PM, Sean Hess <seanhess@gmail.com> wrote:
> Ok, so let's see if I got it.
>
> The HTTP socket overhead isn't a big deal. Replication is likewise
> easy if you design your database with it in mind.
>

Check

> In terms of focus, couch is looking to redefine the web. Its original
> design makes the embedded stuff easy, so it doesn't distract.
>

Check

> Current performance and scalability are appropriate for small to medium
> apps, but aren't yet ready for huge scale?

Almost.

I'd say CouchDB isn't ready for fire-and-forget scalability, but the
kinds of apps whose data sizes and throughput would put single-node
CouchDB to the test also tend to have operations teams capable of
scaling it.

Cloudant is running multi-TB databases. BBC's cluster spans
datacenters. [1] Meebo is running a 64-shard CouchDB Lounge. [2] Even
on a single node, 100+ GB of data in millions of documents is
acceptably performant. [3]

[1] http://video.yahoo.com/watch/5450830/14348068
[2] http://tilgovi.github.com/couchdb-lounge/
[3] http://www.lixo.org/archives/2008/11/02/announcing-lotsofwordscom/

Chris

>
> Thanks again for your responses
>
> On Dec 30, 2009, at 9:37 PM, David Van Couvering
> <david.vancouvering@gmail.com> wrote:
>
>> Hey, I'm totally gung-ho about your crazy world-changing application
>> architecture vision - I had similar flights of fancy when working with
>> Java DB as an embedded database in the browser - you guys have just
>> taken it to the next level.  Here's what I like about it:
>>
>> * Easy to start small and grow as you go
>> * Makes the server side do what it should always have done: just
>>   provide services, not serve up UI - crazy stuff
>> * Allows for offline Internet applications
>> * Allows for data portability
>> * Allows for communities without losing privacy
>>
>> I'll keep an eye on the Cloudant stuff, that looks interesting.  I'm
>> also tempted to help with the dearth of CouchApp docs and tutorials,
>> but I've learned the hard way that if I'm not careful with my time, I
>> end up squeezing blood out of a beet, so we'll just have to see.
>>
>> All the best,
>>
>> David
>>
>> On Wed, Dec 30, 2009 at 7:31 PM, Chris Anderson <jchris@apache.org>
>> wrote:
>>
>>> On Wed, Dec 30, 2009 at 6:55 PM, David Van Couvering
>>> <david.vancouvering@gmail.com> wrote:
>>>> Chris, is the CouchDB "vision" and focus going to be more on the
>>>> localhost/CouchApp type of solution, or more on a robust, scalable
>>>> distributed document database?  My sense is the former, even though I
>>>> think CouchDB has a lot of advantages in the latter.
>>>
>>> No question about it, CouchDB has its sights set on being able to run
>>> at datacenter-scale. Cloudant already has an implementation of
>>> partitioning (based on Cliff Moon's Dynomite) which can handle
>>> big-data / high-throughput by clustering across many machines. I can't
>>> speak for them, but they've expressed interest in rolling this back
>>> into the Apache tree, and I imagine a lot of this work will happen in
>>> the coming months.
>>>
>>> Without being able to run at datacenter-scale, it won't matter much if
>>> we nailed the localhost stuff. P2P replication is neat and all, but
>>> without being able to handle the cases where data sizes and
>>> request-rates go through the roof, Couch wouldn't be useful for
>>> real-world apps.
>>>
>>> In fact, most of our current users are more interested in big-data,
>>> and our API has been designed from the ground up to support those
>>> cases (eg, no multi-doc transactions or validations, map/reduce,
>>> cacheability, etc).
>>>
>>> I am personally excited about the localhost stuff, because no one else
>>> is really thinking about it in the way that Couch is. I think when
>>> we've pulled it off, we'll have fundamentally changed the web
>>> architecture. But in the meantime, we also need to focus on
>>> scale-out.
>>>
>>>
>>>>
>>>> In particular, CouchDB is easy to understand, easy to set up, easy to
>>>> use, is free, and has strong community support.  None of the other
>>>> distributed solutions have all those advantages, be it sharded MySQL,
>>>> Hadoop, Neo4j or Cassandra.
>>>>
>>>> Is there room for CouchDB to go in both directions, or should those
>>>> of us looking for a good distributed DB solution be looking elsewhere?
>>>>
>>>
>>> CouchDB's API is designed to handle giant data. I don't trumpet that
>>> much because it's not personally that exciting to me (scale is a
>>> problem you can throw resources at; building new programming paradigms
>>> requires the dedication to stick with it even when everyone thinks
>>> you're crazy). The Web wasn't designed to solve scaling problems; its
>>> success at scaling was a side-effect of its simplicity.
>>>
>>> Thanks for the thoughtful exchange.
>>>
>>> Chris
>>>
>>>> Thanks!
>>>>
>>>> David
>>>>
>>>> On Wed, Dec 30, 2009 at 5:07 PM, Chris Anderson <jchris@apache.org>
>>>> wrote:
>>>>
>>>>> On Wed, Dec 30, 2009 at 3:51 PM, Sean Clark Hess <seanhess@gmail.com>
>>>>> wrote:
>>>>>> Thanks Tim.
>>>>>>
>>>>>> One more thing I thought of. I don't remember having this impression
>>>>>> before, but as I read the O'Reilly book, it seems that the idea of
>>>>>> running Couch on devices and local computers is a major feature.
>>>>>> There are many features designed to make CouchDB able to function
>>>>>> without middleware.
>>>>>>
>>>>>> My question is: why? That's what middleware is for...
>>>>>
>>>>> The answer is not that Couch is trying to do everything. Really, we
>>>>> have one thing we do extremely well, and that is robust JSON storage
>>>>> with p2p replication, accessed over HTTP.
>>>>>
>>>>> Because we do that already, the ability to serve apps directly from
>>>>> the Couch to the browser is low-hanging fruit. It is also the best way
>>>>> to take advantage of replication. Other NoSQL stores may be able to
>>>>> rival our capabilities as a scalable database, but no database running
>>>>> in a remote datacenter will be as fast for users as a Couch running on
>>>>> localhost.
>>>>>
>>>>> Middleware is fine even for apps that run at the edge, but if your app
>>>>> requires middleware, that is yet another thing that will need to be
>>>>> installed on the user's machine. CouchDB's eventual goal is to be part
>>>>> of the standard desktop stack -- just another feature of web browsers.
>>>>>
>>>>> It is this vision that led to the code that supports HTML rendering.
>>>>> Ajax apps are nearly good enough for most cases, but fall down badly
>>>>> when accessibility and searchability come into play. Also,
>>>>> link-following is an essential part of the REST architecture, and it
>>>>> is absent from a JSON-only interface.
>>>>>
>>>>> There are apps which can be written against the local Couch and
>>>>> browser stack, that can't be written any other way.
>>>>>
>>>>> If you aren't trying to write one of those, the CouchApp feature set
>>>>> is still useful. See for instance the people using _list to filter
>>>>> view responses according to user authentication information, or
>>>>> _update to provide update-in-place-like semantics.
>>>>>
>>>>> I hope this helps to answer your question.
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> --
>>>>> Chris Anderson
>>>>> http://jchrisa.net
>>>>> http://couch.io
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> David W. Van Couvering
>>>>
>>>> http://www.linkedin.com/in/davidvc
>>>> http://davidvancouvering.blogspot.com
>>>> http://twitter.com/dcouvering
>>>>
>>>
>>>
>>>
>>> --
>>> Chris Anderson
>>> http://jchrisa.net
>>> http://couch.io
>>>
>>
>>
>>
>> --
>> David W. Van Couvering
>>
>> http://www.linkedin.com/in/davidvc
>> http://davidvancouvering.blogspot.com
>> http://twitter.com/dcouvering
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
