Message-Id: <1BDAF8CA-0962-4D85-99BB-C255A0FA3EC4@apache.org>
From: Jan Lehnardt
To: couchdb-user@incubator.apache.org
In-Reply-To: <39670022-DC1E-4393-8FEA-7C3908A40E2E@gmail.com>
Subject: Re: DMS, _attachment or path_field?
Date: Tue, 9 Sep 2008 11:47:02 +0200

On Sep 9, 2008, at 11:14, Anselmo Silva wrote:

> When building a document management system, would you consider the
> binary _attachment (a current CouchDB feature) or a path_field to a
> (protected) server/file system environment?
>
> It's an ongoing question about saving binaries into a DB (even with
> CouchDB). From this we can raise some high-level architecture
> questions:
>
> - Will the binary _attachments affect rebuilding the view indexes
>   (even with append)?
> - How about replication? (I think it would be hard and intensive
>   when dealing with larger DB sizes.)

A couple of notes:

If all your stuff is in the DB, managing said stuff becomes easier.

Not using a database as a blob store is usually recommended because
data needs to cross the border between user- and kernel-land a few
times before being sent. The sendfile() syscall helps here, but Erlang
developers say they don't see a measurable difference. So this looks
like a non-issue in Erlang-land and hence in CouchDB.

If you keep your files external to CouchDB, you need to manage deletes
and updates and everything else yourself. If you mix in replication,
you need to manage replication of the files as well. Whether that is
easier or harder for you depends on your setup.

Attachments have no impact on view index creation time.

The more data is in a doc, the more resources you need to replicate
said doc. Keeping attachments in the doc is also very convenient, see
above. Fast, convenient, efficient: pick two.

I don't think this is as big an architectural question as it might
sound. Start by building an app that works. If profiling shows that
attachment replication is your bottleneck, think about solving that in
a way that doesn't hurt you. If you opt for a more complex external
solution now, no one can guarantee that it won't contain a bottleneck
of its own when it comes to profiling. Ripping out attachments and
doing manual handling is not that big a deal (imho), cf.
DbUpdateNotifications.

Cheers
Jan
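
For reference, the two document shapes discussed in this thread can be sketched roughly as below. This is only an illustration under assumptions, not code from the thread: the helper names, the document id, the file path, and the field name "path" are all made up for the example. The `_attachments` layout with `content_type` and base64 `data` follows CouchDB's documented inline attachment format; check it against the release you run.

```python
import base64
import json


def doc_with_inline_attachment(doc_id, filename, content_type, payload):
    """Build a document that carries the binary inline under _attachments."""
    return {
        "_id": doc_id,
        "_attachments": {
            filename: {
                "content_type": content_type,
                # Inline attachment data is sent as base64 text.
                "data": base64.b64encode(payload).decode("ascii"),
            }
        },
    }


def doc_with_path_field(doc_id, path):
    """Build a document that only points at a file kept outside the DB.

    The field name "path" is a hypothetical choice for this sketch.
    """
    return {"_id": doc_id, "path": path}


payload = b"%PDF-1.4 example bytes"
inline_doc = doc_with_inline_attachment(
    "report-1", "report.pdf", "application/pdf", payload
)
external_doc = doc_with_path_field("report-1", "/srv/files/report.pdf")

# Print both shapes as the JSON bodies that would be stored.
print(json.dumps(inline_doc, indent=2))
print(json.dumps(external_doc, indent=2))
```

With the first shape the file travels with the doc, including during replication. With the second, deletes, updates, and replication of the file itself are left to the application, as described above.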