couchdb-user mailing list archives

From André Warnier
Subject Re: Not-even-yet-newbie question
Date Sun, 19 Apr 2009 12:51:06 GMT
Brian Candler wrote:

Brian, thank you very much for the caveats you provided.
I'd rather get into this with my eyes wide open.

Sorry for asking this, but I don't yet know who's who here.

> 1. The couchdb database is a single append-only file. Your filesystem needs
> to support huge files.
Can someone else comment on the above?
Does this really mean that if we have, for one customer, 100,000 CouchDB
"documents", each consisting of minimal metadata plus an attachment that
is on average a 3 MB PDF file, all of this is stored in one single
300 GB file?

If yes, is that not uncomfortable/scary?
(I mean, even nowadays, moving a 300 GB file is not the easiest
thing to do in practice.)

Is there any way to control this?
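
For reference, the 300 GB figure is just the back-of-the-envelope product of the numbers in the question. A minimal sketch, assuming decimal units (1 GB = 1000 MB) and ignoring metadata and any file overhead; the function name is made up for illustration:

```python
# Rough storage estimate using the figures from the question above.
DOCUMENTS = 100_000       # documents for one customer
AVG_ATTACHMENT_MB = 3     # average PDF attachment size in MB


def estimated_size_gb(documents: int, avg_attachment_mb: float,
                      mb_per_gb: int = 1000) -> float:
    """Approximate total attachment volume in GB (overhead ignored)."""
    return documents * avg_attachment_mb / mb_per_gb


print(estimated_size_gb(DOCUMENTS, AVG_ATTACHMENT_MB))  # prints 300.0
```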

> 2. Once you get up to terabytes of documents, it may become impractical
> and/or too slow to compact the database, which involves reading the entire
> database from start to end and writing a completely new copy.

> In your case it sounds like you normally just append documents and leave
> them there forever. 
Not 100%, but overwhelmingly so.

> However, suppose you have a customer who leaves, and is
> no longer paying you for the half terabyte of storage they are using?
Good practical observation. Due to our excellent service, that does not
happen very often, of course. But we do have the occasional real-estate
broker or bank among our customers..

> another who, for legal reasons, requires a document to be purged? (Deleting
> a document in couchdb just marks it as deleted; it can still be retrieved
> until a compaction has been done.)
In fact, this rather mimics our current system. Customers tend not to
remember when they delete a document themselves, and tend to accuse the
computers of losing it.

> I would suggest that the easiest way round these problems - and also a good
> way to improve security - is to have a separate couchdb database for each
> customer. 
That also mimics our current layout.

> This still only requires running a single instance of the couchdb
> server.
Good to know.
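
To make the one-database-per-customer idea concrete, here is a minimal sketch of the HTTP requests involved, as far as I understand the CouchDB HTTP API: `PUT /db` to create a database, `POST /db/_compact` to compact it, and `DELETE /db` to drop it when a customer leaves. The server address, the `customer_` prefix and the helper names are assumptions made up for illustration. The requests are only built here, not sent.

```python
import re
from urllib.request import Request

COUCH_URL = "http://localhost:5984"  # assumption: a local CouchDB server


def customer_db_name(customer_id: str) -> str:
    """Map a customer id to a database name (hypothetical naming scheme)."""
    # CouchDB database names must start with a lowercase letter and allow
    # only lowercase letters, digits and a few punctuation characters.
    slug = re.sub(r"[^a-z0-9_]", "_", customer_id.lower())
    return "customer_" + slug


def create_db(customer_id: str) -> Request:
    """PUT /<db>: create the customer's own database."""
    return Request(f"{COUCH_URL}/{customer_db_name(customer_id)}", method="PUT")


def compact_db(customer_id: str) -> Request:
    """POST /<db>/_compact: compact only this customer's database."""
    return Request(
        f"{COUCH_URL}/{customer_db_name(customer_id)}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def drop_db(customer_id: str) -> Request:
    """DELETE /<db>: drop the whole database when the customer leaves."""
    return Request(f"{COUCH_URL}/{customer_db_name(customer_id)}", method="DELETE")
```

Each request could then be sent with `urllib.request.urlopen(...)`. The point of the layout is that compaction after a purge, or dropping a departed customer, touches only that customer's file rather than one huge shared one.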

Brian, I really appreciate this information.  I would probably have 
found this out as I investigate CouchDB further, but it would have taken 
me a lot more time.
