incubator-couchdb-user mailing list archives

From Jim Klo <jim....@sri.com>
Subject Scaling CouchDB
Date Fri, 22 Apr 2011 18:40:21 GMT
I'm part of the core Federal Learning Registry dev team [http://www.learningregistry.org],
and we're using CouchDB to store and replicate contents of the registry within our network.

One of the questions that has come up as we start planning our initial production
release is: what is the scalability strategy for CouchDB?  Long term, we expect to have
an enormous amount of data from activity streams and metadata inserted into the network, and
I'd like to have an idea of what we need to work towards now so there's no big surprise when we
start getting close to hitting some limits.

As part of our infrastructure strategy, we've chosen Amazon Web Services EC2 & EBS as
our hosting provider for the first rollout.  EBS currently has an upper limit of 1TB per volume.
Other cloud or non-cloud solutions may have similar or different limitations, but I'm
only concerned right now with how we might deal with this on EC2 and EBS.

1. Are there CouchDB limits that we are going to run into before we hit 1TB?
2. Is there a strategy for disk spanning to go beyond the 1TB limit by incorporating multiple
volumes, or do we need to leverage a solution like BigCouch, which seems to require us to spin
up multiple CouchDB instances and do some sort of sharding/partitioning of the data?  I'm also
curious how queries that span shards/partitions work, and whether this is transparent.

Thanks,

- Jim


Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International




