couchdb-user mailing list archives

From Benjamin Smith <>
Subject High volume couchDB?
Date Thu, 18 Jun 2009 02:46:29 GMT
I recently ran across this project while doing research into Erlang for High 
Availability. From what I can see, this project may be exactly what I've been 
looking for. 

We've been using a clustered filesystem API, developed in-house, for our large 
PHP application to keep files and related semi-structured data. We see 
perhaps 250,000 file operations per day, with 3 nodes to store data. Think: 
server-level RAID 1. Total data size is ~1 TB, with about 50% growth per 

What I'm looking for: 
1) Ability to store misc files. (Mixed: PDFs, JPGs, iso images, text files, etc) 

2) Ability to store related metadata close by (time stamp, ownership data,
application-specific data, etc.). We do this now by keeping a "sister" file, 
with a ".mdt" extension, containing the data serialized in PHP format. 

3) Redundancy: zero data loss in the event of a server failure. We achieve 
this now with our own in-house file server daemon running under xinetd. 
Conceptually, it's similar to WebDAV, but lighter weight. 

4) Failover: ability to keep working even with partial cluster failure. 

5) Healing: ability to get "back together" when downed servers are restored. 

6) Performance that degrades gracefully: What happens when the screws get put 
to CouchDB? What kinds of loads can it sustain given mid-range hardware? 

7) Backups off-site: disaster recovery plans. Currently we're using rsync run 
8) Reliability: It should "just work" without needing regular babysitting. 
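
For what it's worth, points 1 and 2 seem to map onto CouchDB documents with 
inline attachments: the metadata lives as ordinary JSON fields, and the file 
itself rides along base64-encoded under the "_attachments" key of the same 
document. A minimal sketch of building such a request body (the field names, 
attachment name, and sample data here are made up for illustration):

```python
import base64
import json

def make_doc(file_bytes, content_type, metadata):
    """Build a CouchDB document body carrying metadata as plain JSON
    fields plus the file as an inline (base64-encoded) attachment."""
    doc = dict(metadata)  # timestamp, ownership, app-specific fields, ...
    doc["_attachments"] = {
        "content": {
            "content_type": content_type,
            "data": base64.b64encode(file_bytes).decode("ascii"),
        }
    }
    return json.dumps(doc)

# Example: a PDF plus what used to live in its ".mdt" sister file.
body = make_doc(b"%PDF-1.4 ...", "application/pdf",
                {"owner": "bsmith", "created": "2009-06-17T19:46:29Z"})
```

If I read the docs right, one PUT of that body to a URL like 
http://couch:5984/files/some-doc-id stores file and metadata together, and the 
attachment can later be fetched directly under the document's URL, so the 
"sister file" problem goes away.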

Am I right in reading that CouchDB accomplishes all/most/many of these goals? 
If not all of them, which would need watching? 
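
On points 3-5 and 7, CouchDB's answer appears to be replication: each node 
pulls from or pushes to its peers, a downed node catches up incrementally when 
it comes back, and an off-site backup is just one more replication target. A 
sketch of the body one would POST to a node's /_replicate endpoint (the host 
names here are hypothetical; "continuous" asks the replication to keep running 
as changes arrive):

```python
import json

def replication_request(source, target, continuous=True):
    """Body for POST /_replicate: keep `target` in sync with `source`.
    One such replication per peer, plus one per off-site mirror."""
    return json.dumps({
        "source": source,
        "target": target,
        "continuous": continuous,
    })

# e.g. node1 continuously mirrors its "files" database off-site:
body = replication_request("http://node1:5984/files",
                           "http://offsite:5984/files")
```

That would replace both the in-house daemon's mirroring and the rsync job, if 
it holds up under the load described above -- which is really my question.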


