incubator-couchdb-user mailing list archives

From David Van Couvering <da...@vancouvering.com>
Subject CouchDB and clustering
Date Thu, 26 Feb 2009 17:48:56 GMT
Hi, all.  As I am looking more into CouchDB, I am realizing that there is a
misconception I had (or think I have), and I just want to clarify:

CouchDB is distributed in nature in that it allows for bi-directional
asynchronous replication even under conditions where the network is
unreliable.  So as Jens just said on the user list, "CouchDB could be used to
implement a database with millions of server nodes all around the world".
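As a rough illustration of what "bi-directional" means here: it is usually expressed as two one-way replications, one in each direction, each requested through CouchDB's `_replicate` resource with a JSON body naming a `source` and a `target`. The sketch below only builds those two requests and does not send them. The helper name, the host URLs, and the database name are assumptions made for the example.

```python
import json


def replication_requests(db, node_a, node_b):
    """Build the two one-way replication requests that together form a
    bi-directional sync of `db` between two nodes.

    Returns a list of (url, json_body) pairs; nothing is sent over the network.
    """
    def one_way(source_node, target_node):
        # The request is posted to the source node's _replicate resource.
        url = f"{source_node}/_replicate"
        body = json.dumps({
            "source": f"{source_node}/{db}",
            "target": f"{target_node}/{db}",
        })
        return url, body

    return [one_way(node_a, node_b), one_way(node_b, node_a)]


# Hypothetical hosts and database name, for illustration only.
for url, body in replication_requests(
    "notes", "http://alpha.example:5984", "http://beta.example:5984"
):
    print("POST", url, body)
```

Because each direction is a separate request, either one can fail and be retried later without affecting the other, which is what makes this workable over a flaky network.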

But if I understand things correctly, a given CouchDB database must be able
to reside completely on a single node.  In the current implementation, there
is no real clustering support.  The features that I would expect in a
distributed storage system include (this is probably not a complete
list):

- Partitioning/sharding of data using some kind of consistent hash on the
key
- Efficient failure detection and failover
- A "single image" view of the cluster from the perspective of the client
API
- An easy-to-use management interface for managing the cluster (status view
and notification, adding and removing nodes, online upgrade, etc.)
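
As a rough illustration of the first item, partitioning by a consistent hash on the key can be sketched as a hash ring. This is only a sketch under stated assumptions, not anything CouchDB provides: the `HashRing` class, the node names, and the choice of MD5 with 64 virtual points per node are all made up for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping document IDs to node names."""

    def __init__(self, nodes, replicas=64):
        self.replicas = replicas   # virtual points per node (assumed value)
        self._keys = []            # sorted hash positions on the ring
        self._owners = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # MD5 is used only to spread keys evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Place several virtual points per node to even out the load.
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            self._owners[pos] = node
            bisect.insort(self._keys, pos)

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owners.get(pos) == node:
                del self._owners[pos]
                self._keys.remove(pos)

    def node_for(self, doc_id):
        if not self._keys:
            raise ValueError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, self._hash(doc_id)) % len(self._keys)
        return self._owners[self._keys[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("doc-123"))
```

The property that matters for a cluster is that removing a node only reassigns the documents that node owned; every other document stays where it was, so adding or losing a node does not reshuffle the whole data set.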

Watching the mail go by, it appears that this is something that has been
thought of and considered in the architecture, but not yet implemented.

I have seen some discussions about adding sharding support and a single
image view, but not much about efficient failure detection or a management
infrastructure and API.

First of all - is my perception correct or am I missing something?  I often
do...

Second - I think they are, but I wanted to confirm - are these things
planned, and the community just hasn't had time to address them yet?  What
priority are they taking right now, or are there other fish to fry?

The reason I ask is, it appears the advantage of CouchDB *right now* is
that it is a highly robust, flexible and high-throughput read-mostly data
store.   The other advantage is that the API is easy, approachable, and
web-ready, unlike many stores out there.  I love the pre-compiled views that
allow you to have highly efficient slice-and-dice views into your
documents.  I also think it's an excellent base for a peer-to-peer
replicated data store, allowing people to collaborate over the Internet
without requiring a centralized server (although I am concerned about how
easy it is for a Mere Mortal to install a CouchDB-based app on their
computer).

But what it's not ready for is to give you an out-of-the-box clustered
storage solution.  Maybe later, but not now.

Is that about right?

Thanks!

David
-- 
David W. Van Couvering

I am looking for a senior position working on server-side Java systems.
 Feel free to contact me if you know of any opportunities.

http://www.linkedin.com/in/davidvc
http://davidvancouvering.blogspot.com
http://twitter.com/dcouvering
