couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From niall el-assaad <nial...@gmail.com>
Subject Re: How many nodes can couchdb scale to?
Date Mon, 28 Feb 2011 08:39:44 GMT
Hi Isaac,

Thanks for the reply. I've put some more info inline:

>
>
> * Replication topology: is the plan to have replication from the branch
> office
>  nodes to your centralized data center? (n:1)
>

We would have a single box at each branch office replicating with two boxes
at the data centre (for resilience).
The data centre boxes would then replicate with each other.

So if some data was inserted at the remote branch, it would be replicated to
the data centre, and the data centre would replicate it to the other 1999
branches.


> * Replication type: continuous or triggered manually/programatically?
>

The ideal would be continuous.


> * Scope of data set: I would be more concerned with writes than reads.
>  You'll need to have an idea of what your current aggregate average and
>  peak writes per second are, how much data is written for a given
>  period of time, and how far you think you will need this rate to scale in
>  the future.
>

I would expect the average to be around 10 writes per second, with the peak
at about 100 writes per second.


> * Why Couch: is CouchDB going to be addressing a brand new need, or
>  is it going to replace existing systems for known reasons? If it's the
>  latter, what is it about your current systems that aren't meeting your
>  demands, and what do you hope Couch will provide that will fill the gap?
>  (Specifically looking for performance data that you might have already
>  collected, and if Couch is going to be living on your existing hardware
>  or new hardware.)
>

Its for a completely new project, the main driver for looking at CouchDB is
the ability to have a very large scale cluster with write capabilities in
each branch. Mainly so if their is a failure to communications between the
branch and the data centre everything can continue to work, then sync up
later.

>
> I haven't dealt with large distributed Couch systems, but my instinct
> would be that Couch wouldn't have any problem with a 2000:1 replicated
> system. (See Ubuntu One as an example of a large CouchDB system with
> many external replicators.) The ability to handle it would come down
> to how well the aggregate data set matches the size of hardware and
> replication layout in your data center, and of course available
> ingress bandwidth.
>

Understand, we would scale the hardware and bandwidth accordingly based upon
testing of the application.

thanks,

niall

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message