incubator-couchdb-user mailing list archives

From Brian Mitchell <binar...@gmail.com>
Subject Re: Limit on the number of databases?
Date Thu, 26 May 2011 15:21:43 GMT


On Thursday, May 26, 2011 at 6:22 AM, Glenn Bech wrote:

> Hi,
> 
> I just want to ask if there are limits on the number of databases in Couch.
> I am playing around with embedded Couch on Android and am thinking along the
> lines of having
> one database per user, and using replication to push data from the client to
> the server. This will provide for an excellent "offline" user experience.
> 
> This will of course not work if Couch does not handle unlimited databases
> very well, performance-wise or otherwise.
> 
> Does this sound like a feasible design solution?
> 
> Regards,
> 
> Glenn

I've done some testing, and there are a couple of things to keep in mind.

First of all, CouchDB relies directly on the scalability of your filesystem. Having one database
per user means you also have at least one file on disk for each of them. Since CouchDB currently
stores them all in one directory, you'll need to select a filesystem that can handle your
expected scale (many filesystems should be fine at the millions-of-files level, but
characteristics differ, so do test this).
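
One rough way to run that kind of test is sketched below. It only creates empty files in a single directory and times creating and stat-ing them, so it says nothing about real CouchDB databases; the `user_<n>.couch` naming and the `time_many_files` helper are made up for illustration. Point `base_dir` at the filesystem you actually plan to use.

```python
import os
import tempfile
import time


def time_many_files(n_files, base_dir=None):
    """Create n_files empty files in one directory, then stat each one.

    Returns (create_seconds, stat_seconds, files_listed).
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as d:
        start = time.perf_counter()
        for i in range(n_files):
            # Hypothetical naming: one empty placeholder file per user.
            with open(os.path.join(d, f"user_{i}.couch"), "wb"):
                pass
        create_s = time.perf_counter() - start

        start = time.perf_counter()
        for i in range(n_files):
            os.stat(os.path.join(d, f"user_{i}.couch"))
        stat_s = time.perf_counter() - start

        # Listing the directory is what many backup and admin tools do first.
        listed = len(os.listdir(d))
    return create_s, stat_s, listed


if __name__ == "__main__":
    for n in (1_000, 10_000):
        c, s, listed = time_many_files(n)
        print(f"{n:>6} files: create {c:.2f}s, stat {s:.2f}s, listed {listed}")
```

Raise the counts toward your expected number of users and watch whether the per-file cost stays flat or starts to climb.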

Another problem, one which I don't have an immediate answer for, is backup. While you could
claim replication is enough for this, I'd say it isn't. The events you need backups for also
include maliciously destroyed or manipulated data, or simply bugs. I'd rather not trust that
my data will never get screwed up by the code that accesses it. Many backup systems are
designed around a small number of files. Being able to roll back to a point in time with
millions of files could be an extremely painful process. (I have ideas on how to solve this,
but it's still not an easy problem.)

Last but not least, consider the number of active databases you'll need at any single time.
This can be split across many machines, of course, but it still adds up quickly. Open file descriptors
are great, but not if you have to close and then reopen them all the time. A carefully tuned
VM can manage many thousands w/o a problem, but I wouldn't push this too much higher. So if
you have 15 machines and 30k active users in any single 1-minute window, that would be 2k
files open and active per machine.
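
The arithmetic above can be written out as a small sketch. It assumes one open file per active database and an even spread of users across machines, which ignores view index files and client sockets; the `reserve` fraction and both helper names are hypothetical.

```python
import math


def open_files_per_machine(active_users, machines):
    """Active databases per machine, assuming one file per user and an even spread."""
    return math.ceil(active_users / machines)


def within_budget(per_machine, fd_limit, reserve=0.25):
    """True if per_machine fits under fd_limit after holding back a
    fraction (reserve) for sockets, logs and other descriptors."""
    return per_machine <= fd_limit * (1 - reserve)


if __name__ == "__main__":
    per = open_files_per_machine(30_000, 15)
    print(per)  # 2000

    try:
        import resource  # Unix only

        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        if soft == resource.RLIM_INFINITY:
            print("no soft limit on open files")
        else:
            print(f"soft limit {soft}: fits = {within_budget(per, soft)}")
    except ImportError:
        print("resource module not available on this platform")
```

With a soft limit of 1024 open files, 2k active databases per machine would not fit, so the per-process descriptor limit is one of the things to tune first.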

Brian. 