From: Dave Cottlehuber
To: user@couchdb.apache.org
Subject: Re: huge attachments - experience?
Date: Tue, 26 Mar 2013 11:59:12 +0100

On 25 March 2013 22:44, Jens Alfke wrote:
>
> On Mar 25, 2013, at 1:13 PM, svilen wrote:
>
> > As i don't really need more than 1 version back, i'm playing with idea
> > of using couchdb for that. Either putting the files as attachments, or
> > if not possible, using it as filesystem-miming synchronised metadata,
> > with appropriate listeners reacting on changes (like rename, mv, etc).

+1 to all Jens & Nils said, with 2 more points.

If you store only metadata in couch, using a hash like md5 of the data
instead of the actual filename, then using that hash to point to the
stored files on disk is quite attractive. Renames and moves are all
internal to couchdb, as the data hasn't changed. It will deduplicate
itself as well if you have multiple copies (e.g. revisions of docs).

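A rough sketch of that layout in Python, standard library only. The
helper names (store_blob, make_doc) and the document fields (path, md5,
length) are made up for illustration and are not a CouchDB convention;
the dicts stand in for the metadata docs you would save in couch.

```python
import hashlib
import os
import tempfile


def store_blob(store_dir, data):
    """Write data to a file named after its MD5 hex digest; return the digest."""
    digest = hashlib.md5(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):  # identical content is only written once
        with open(path, "wb") as f:
            f.write(data)
    return digest


def make_doc(path, digest, length):
    """Metadata-only document (hypothetical shape) that would live in couch."""
    return {"path": path, "md5": digest, "length": length}


store = tempfile.mkdtemp()  # stand-in for the on-disk blob directory
data = b"some large file contents"

# Two logical files with the same content share one blob on disk.
doc_a = make_doc("reports/q1.bin", store_blob(store, data), len(data))
doc_b = make_doc("backup/q1-copy.bin", store_blob(store, data), len(data))

# A rename only touches the metadata; the blob stays where it is.
doc_a["path"] = "archive/q1.bin"
```

Since the blob name depends only on the content, the second store_blob
call above finds the file already there and writes nothing.
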
The downside of putting stuff outside couch is that you need to manage
yourself the things you would otherwise get for free:

- easy replication model
- deletion handling (how many docs have this file, should I delete this
  file now because the document attachment was deleted, etc)
- streaming of data from within couchdb
- inbuilt compression
- keeping replication partners in sync ("I don't need this doc anymore
  but the others don't yet have the updated copy" type problems, esp. in
  a mesh replication topology)

The other nasty thing about attachments in couch is that during
replication, if there is a failure, we can't restart part-way through.
And as they're stored directly on disk, we duplicate that waste both on
the network and in storage inside the DB file. This may or may not be a
problem for your use case.

A+
Dave