From: Matthieu Rakotojaona
Date: Thu, 15 Mar 2012 15:09:40 +0100
Subject: Re: Size of couchdb documents
To: user@couchdb.apache.org

On Thu, Mar 15, 2012 at 3:00 PM, Daniel Gonzalez wrote:
> I understand the overheads that you are referring to, but it still shocks
> me that CouchDB needs 8 times as much space to store the data.
>
> Are there any guidelines on what to do/avoid in order to get a lower
> overhead ratio?

I got surprisingly good results when changing the _id design. I advise
you to follow what is written on this page:

http://wiki.apache.org/couchdb/Performance#File_size

Basically:

- use shorter _ids
- use sequential _ids. If you cannot (e.g. because you have multiple
  disconnected parts that will have to merge often, which would cause
  too many clashes), you can use CouchDB's own semi-sequential generated
  UUIDs. Yes, UUIDs contradict the first point.

-- 
Matthieu RAKOTOJAONA
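To make the two _id suggestions above concrete, here is a minimal sketch of one way to produce short, sequential _ids on the client side. It is only an illustration, not something taken from the wiki page: the names `encode_id` and `sequential_ids`, the base-36 alphabet, and the width of 6 are all assumptions, and it presumes a single writer that can keep a counter.

```python
import itertools

# Digits then lowercase letters: this alphabet is already in ASCII order,
# so fixed-width strings built from it sort the same way as the numbers.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode_id(n, width=6):
    """Encode a non-negative integer as a zero-padded base-36 string.

    Hypothetical helper: the width is an assumption; 6 characters
    cover 36**6 (about 2.1 billion) documents.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    encoded = "".join(reversed(digits)) or "0"
    if len(encoded) > width:
        raise ValueError("counter does not fit in the chosen width")
    return encoded.rjust(width, "0")


def sequential_ids(start=0, width=6):
    """Yield short _ids in increasing order, starting at `start`."""
    for n in itertools.count(start):
        yield encode_id(n, width)


if __name__ == "__main__":
    ids = list(itertools.islice(sequential_ids(), 5))
    print(ids)
    # Each _id is 6 characters, versus 32 for a hex UUID.
    print(len(ids[0]))
```

Each _id produced this way is 6 characters long instead of the 32 hexadecimal characters of a UUID, and new _ids always sort after older ones under plain string comparison. With several disconnected writers a shared counter is not available, which is the case where the server-generated semi-sequential UUIDs mentioned above are the fallback.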