couchdb-user mailing list archives

From CGS <cgsmcml...@gmail.com>
Subject Re: same attachment across documents / databases
Date Fri, 28 Oct 2011 11:37:56 GMT
Gregor, your approach makes perfect sense, but it will take some work 
because:
1. the attachments are encoded in CouchDB;
2. you will need a document scanner.

I don't know your project, but I would serve the attachments through a 
separate channel managed by a web server and point to them from CouchDB 
documents (with at most one document per attachment to hold the 
attachment description, fetched using include_docs). In the end it's up 
to you, since you know your project's requirements best.
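
A minimal sketch of the pointer idea, assuming the file itself lives 
outside CouchDB and one small document describes it. The field names, 
the "file:" id prefix and the base URL are all hypothetical; nothing 
here talks to CouchDB, it only builds the JSON you would store.

```python
import hashlib


def make_pointer_doc(data: bytes, filename: str, content_type: str, base_url: str) -> dict:
    """Build a JSON-ready document that points at an externally stored file.

    Assumption: the web server exposes each file under base_url/<sha256 digest>.
    """
    digest = hashlib.sha256(data).hexdigest()
    return {
        # Deriving the id from the content means identical bytes map to one document.
        "_id": "file:" + digest,
        "type": "file_pointer",  # hypothetical marker field
        "filename": filename,
        "content_type": content_type,
        "length": len(data),
        "digest": digest,
        "url": base_url.rstrip("/") + "/" + digest,
    }


if __name__ == "__main__":
    doc = make_pointer_doc(
        b"example bytes", "report.pdf", "application/pdf", "https://files.example.com"
    )
    print(doc["_id"], doc["url"])
```

Other documents would then carry only the pointer document's _id, and a 
view that emits {"_id": ...} as its value, queried with 
include_docs=true, could pull in the description. Replication would copy 
the small pointer document rather than the file bytes.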

CGS



On 10/28/2011 01:25 PM, Robert Newson wrote:
> The approach would be to teach CouchDB how to deduplicate
> byte-identical attachments (or chunks thereof) within a file. Sounds a
> bit tricky but not impossible.
>
> B.
>
> On 28 October 2011 12:22, Gregor Martynus<gregor@martynus.net>  wrote:
>> Thanks for your responses!
>>
>> I'm not sure if there is any approach to minimize the disadvantage of
>> replicated attachments eating up space and performance; if there is, please
>> let me know.
>>
>> My approach would be to set up a backend server that listens for new
>> attachments coming in, transfers these to an external store like S3 and
>> then replaces the doc attachment in the DB with some kind of pointer to the
>> new location of the attachments.
>>
>> Not sure if that makes sense; I'm open to suggestions.
>>
>> And once more thanks for your help!
>>
>> On Fri, Oct 28, 2011 at 1:14 PM, CGS<cgsmcmlxxv@gmail.com>  wrote:
>>
>>> Hi Gregor,
>>>
>>> I might be wrong because I am no expert in that field. But from the
>>> documentation, one can deduce that all attachments are stored inside the
>>> document rather than pointing to a physical file (quite logical if you
>>> consider the main purpose of CouchDB: a web-oriented database). As the
>>> replication mechanism is the same for local replication and replication over
>>> the network (just transferring the data from the source file to the target
>>> file), my guess is that your attachment is copied into every physical file
>>> to which a replication operation was applied.
>>>
>>> However, depending on your project requirements, instead of an attachment
>>> you can store a pointer, which you can use in shows (at the user's end).
>>> Such a method is limited by the cross-domain restrictions (if you use
>>> AJAX).
>>>
>>> I hope this answer will help you in designing your project, and if somebody
>>> notices any mistake in my answer, please correct me.
>>>
>>> Cheers,
>>> CGS
>>>
>>>
>>>
>>>
>>> On 10/28/2011 12:32 PM, Gregor Martynus wrote:
>>>
>>>> I wonder how CouchDB stores document attachments internally. In
>>>> particular, I'd like to know: if I replicate a document with attachments
>>>> from one database to another, will the attachments be stored twice
>>>> internally, or will CouchDB be smart enough to understand that the
>>>> attachment already exists and only needs to be linked?
>>>>
>>>> I hope my question is clear. In my case, each account has its own database
>>>> with its own documents. Documents can be shared between accounts, which
>>>> will be done using replication. But if attachments get stored multiple
>>>> times although they are exactly the same, I fear that it would use up too
>>>> much space and eventually slow down replications etc.
>>>>
>>>>

