incubator-couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Not-even-yet-newbie question
Date Sun, 19 Apr 2009 13:35:40 GMT

On 19 Apr 2009, at 08:51, André Warnier wrote:

> Brian Candler wrote:
> [...]
>
> Brian, thanks extremely for the caveats you provided.
> I'd rather get into this with eyes wide open..
>
> Sorry for asking this, but since I don't yet know who's who here..
>
>> 1. The couchdb database is a single append-only file. Your filesystem
>> needs to support huge files.
> Can someone else comment on the above ?
> Does this really mean that if we have, for one customer, 100,000
> CouchDB "documents", each consisting of minimal meta-data plus an
> attachment that is on average a 3 MB PDF file, all of this is stored
> in one single 300 GB file ?
>
> 1a.
> If yes, is that not uncomfortable/scary ?

Yes, this is correct.

> (I mean, even nowadays, moving a 300 GB file is not the easiest
> practical thing to do).

Yes. If this is your use case, you might want not just one DB per user
but one DB per user per day.
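
As a rough sketch of what "a DB per user per day" could look like in
practice, the Python below derives one database name per user and day
and redoes the size estimate from the question. The helper names
(db_name_for, estimated_size_gb) and the "u_" prefix are hypothetical,
chosen only for illustration. The one fixed constraint it relies on is
that a CouchDB database name must start with a lowercase letter and may
contain only lowercase letters, digits and the characters _ $ ( ) + - /

```python
import re
from datetime import date

# CouchDB database names must begin with a lowercase letter and may only
# contain lowercase letters, digits and the characters _ $ ( ) + - /
VALID_DB_NAME = re.compile(r"^[a-z][a-z0-9_$()+/-]*$")


def db_name_for(user: str, day: date) -> str:
    """Build a per-user, per-day database name (hypothetical scheme)."""
    # Collapse every run of characters outside a-z / 0-9 into one "_".
    slug = re.sub(r"[^a-z0-9]+", "_", user.lower()).strip("_")
    if not slug:
        raise ValueError("user name has no usable characters")
    # The "u_" prefix is an assumption; it guarantees a leading letter
    # even when the user name starts with a digit.
    name = f"u_{slug}_{day:%Y_%m_%d}"
    if not VALID_DB_NAME.match(name):
        raise ValueError(f"not a valid database name: {name!r}")
    return name


def estimated_size_gb(doc_count: int, avg_attachment_mb: float) -> float:
    """Lower bound on file size: attachments only, 1 GB = 1000 MB."""
    return doc_count * avg_attachment_mb / 1000


print(db_name_for("Acme Corp", date(2009, 4, 19)))
print(estimated_size_gb(100_000, 3))
```

Splitting this way keeps each file small enough to move, compact or
delete on its own. The price is that anything spanning several days has
to query several databases.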


> 2a.
> Is there anything that allows to control this ?

see above.


Cheers
Jan
--


>> 2. Once you get up to terabytes of documents, it may become
>> impractical and/or too slow to compact the database, which involves
>> reading the entire database from start to end and writing a
>> completely new copy.
> Agreed.
>
>> In your case it sounds like you normally just append documents and
>> leave them there forever.
> Not 100%, but overwhelmingly so.
>
>> However, suppose you have a customer who leaves, and is no longer
>> paying you for the half terabyte of storage they are using?
> Good practical observation.  Due to our excellent service, that does
> not happen very often of course.  But we do have the occasional
> real-estate broker or bank among our customers..
> ;-)
>
>> Or another who, for legal reasons, requires a document to be purged?
>> (Deleting a document in couchdb just marks it as deleted; it can
>> still be retrieved until a compaction has been done.)
> In fact this rather mimics our current system.  Customers tend to not
> remember when they delete a document themselves, and tend to accuse
> the computers of losing it.
>
>> I would suggest that the easiest way round these problems - and also
>> a good way to improve security - is to have a separate couchdb
>> database for each customer.
> That also mimics our current layout.
>
>> This still only requires running a single instance of the couchdb
>> server.
> Good to know.
>
> Brian, I really appreciate this information.  I would probably have
> found this out as I investigate CouchDB further, but it would have
> taken me a lot more time.
>
>

