incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: same attachment across documents / databases
Date Fri, 28 Oct 2011 11:25:30 GMT
The approach would be to teach CouchDB how to deduplicate
byte-identical attachments (or chunks thereof) within a file. Sounds a
bit tricky but not impossible.
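
As a rough illustration of what deduplicating byte-identical attachments
could mean, here is a minimal in-memory sketch in Python. It is not how
CouchDB stores attachments; the DedupStore class and its methods are
hypothetical names, and SHA-256 is an assumed choice of hash. Each blob is
keyed by the hash of its bytes, so identical content is kept once no matter
how many documents refer to it.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical, not CouchDB's format)."""

    def __init__(self):
        self._blobs = {}  # digest -> attachment bytes, stored once
        self._refs = {}   # (doc_id, attachment name) -> digest

    def put(self, doc_id, name, data):
        # Identical bytes always hash to the same digest, so a second
        # copy only adds a reference, not another blob.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._refs[(doc_id, name)] = digest
        return digest

    def get(self, doc_id, name):
        return self._blobs[self._refs[(doc_id, name)]]

    def stored_bytes(self):
        # Total size of unique blobs actually held.
        return sum(len(blob) for blob in self._blobs.values())
```

Putting the same bytes under two document ids leaves stored_bytes()
unchanged after the second put, which is the space saving being discussed.
A real implementation would also have to handle compaction and reference
counting, which this sketch ignores.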

B.

On 28 October 2011 12:22, Gregor Martynus <gregor@martynus.net> wrote:
> Thanks for your responses!
>
> I'm not sure whether there is any way to minimize the disadvantage of
> replicated attachments eating up space and hurting performance. If there
> is, please let me know.
>
> My approach would be to set up a backend server that listens for new
> attachments coming in, transfers them to an external store like S3, and
> then replaces the doc attachment in the DB with some kind of pointer to the
> new location of the attachment.
>
> Not sure if that makes sense; I'm open to suggestions.
>
> And once more thanks for your help!
>
> On Fri, Oct 28, 2011 at 1:14 PM, CGS <cgsmcmlxxv@gmail.com> wrote:
>
>> Hi Gregor,
>>
>> I might be wrong because I am no expert in that field. But from the
>> documentation, one can deduce that all attachments are inserted into the
>> document rather than pointing to a physical file (quite logical if you
>> consider the main purpose of CouchDB: a web-oriented database). As the
>> replication mechanism is the same for local replication and replication
>> over the network (it just transfers the data from the source file to the
>> target file), my guess is that your attachment is copied into every
>> physical file to which a replication operation was applied.
>>
>> However, depending on your project's requirements, instead of an
>> attachment you can store a pointer, which you can then use in shows (at
>> the user's end). The limitations of such a method are imposed by
>> cross-domain restrictions (if you use AJAX).
>>
>> I hope this answer will help you in designing your project, and if somebody
>> notices any mistake in my answer, please correct me.
>>
>> Cheers,
>> CGS
>>
>>
>>
>>
>> On 10/28/2011 12:32 PM, Gregor Martynus wrote:
>>
>>> I wonder how CouchDB stores document attachments internally. In
>>> particular, I'd like to know: if I replicate a document with attachments
>>> from one database to another, will the attachments be stored twice
>>> internally, or will CouchDB be smart enough to understand that the
>>> attachment already exists and only needs to be linked to?
>>>
>>> I hope my question is clear. In my case, each account has its own database
>>> with its own documents. Documents can be shared between accounts, which
>>> will be done using replication. But if attachments get stored multiple
>>> times even though they are exactly the same, I fear that it would use
>>> up too much space and eventually slow down replication, etc.
>>>
>>>
>>
>
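
The external-store idea from Gregor's message (move the attachment to
something like S3 and leave a pointer in the document) could be sketched
roughly as below. This is only an illustration under stated assumptions:
the document is assumed to carry its attachments inline as base64 in
_attachments, the upload callback is hypothetical and stands in for
whatever store is used, and the external_attachments field name is made up
for this example.

```python
import base64
import copy


def externalize_attachments(doc, upload):
    """Return a copy of doc with inline attachments replaced by pointers.

    upload(name, data, content_type) is a hypothetical callback that stores
    the bytes somewhere external and returns a URL for them.
    """
    new_doc = copy.deepcopy(doc)  # leave the caller's document untouched
    attachments = new_doc.pop("_attachments", {})
    pointers = {}
    for name, meta in attachments.items():
        data = base64.b64decode(meta["data"])  # assumes inline base64 data
        url = upload(name, data, meta["content_type"])
        pointers[name] = {
            "url": url,
            "content_type": meta["content_type"],
            "length": len(data),
        }
    if pointers:
        # "external_attachments" is an invented field name for this sketch.
        new_doc["external_attachments"] = pointers
    return new_doc
```

The rewritten document would then be saved back in place of the original,
so replication copies only the small pointer rather than the attachment
bytes. Access control and cleanup of the external objects are left out.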
