From: Daniel Gonzalez
Date: Thu, 15 Mar 2012 16:14:53 +0100
Subject: Re: Size of couchdb documents
To: user@couchdb.apache.org

Hi Matthieu,

This really seems to help. I am now using a base62-encoded, monotonically increasing integer, which means my doc_id goes from "0" onwards, using the alphabet:

ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz

I am now getting about 3000 docs/s, more or less stable, and the size of my documents has decreased from 3 KB to 0.4 KB. I am not sure whether these metrics will worsen as the database grows, but my feeling is that the situation has improved a lot just by changing the doc_id.

I have one more question: is the alphabet I have shown above "ordered" for CouchDB?

Thanks,
Daniel

On Thu, Mar 15, 2012 at 3:09 PM, Matthieu Rakotojaona <matthieu.rakotojaona@gmail.com> wrote:

> On Thu, Mar 15, 2012 at 3:00 PM, Daniel Gonzalez wrote:
> > I understand the overheads that you are referring to, but it still shocks
> > me that CouchDB needs 8 times as much space to store the data.
> >
> > Are there any guidelines on what to do/avoid in order to get a lower
> > overhead ratio?
>
> I got surprisingly good results when changing the _id design. I advise
> you to follow what is written on this page:
> http://wiki.apache.org/couchdb/Performance#File_size
>
> Basically:
> - use shorter _ids
> - use sequential _ids.
>   If you cannot (e.g. because you have multiple
>   disconnected parts that will have to merge often and that would cause
>   too many clashes), you can use CouchDB's own semi-sequential generated
>   UUIDs. Yes, UUIDs contradict the first point.
>
> --
> Matthieu RAKOTOJAONA
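
For reference, the base62 doc_id scheme and the ordering question in this message can be explored with a small sketch. It is not the code actually used in the thread: the function names, the `ASCII_ALPHABET` alternative and the fixed `width` padding are all hypothetical, and the check assumes that ids are compared character by character in ASCII order, which is my understanding of how `_all_docs` sorts ids but should be verified against your CouchDB version.

```python
import string

# Alphabet exactly as quoted in the message: uppercase, digits, lowercase.
QUOTED_ALPHABET = string.ascii_uppercase + string.digits + string.ascii_lowercase

# Hypothetical alternative: the same 62 characters in ascending ASCII order.
ASCII_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode(n, alphabet, width=0):
    """Encode a non-negative integer in base len(alphabet).

    The result is left-padded with the alphabet's first character
    (its "zero" digit) up to `width` characters.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    base = len(alphabet)
    digits = []
    while True:
        n, rem = divmod(n, base)
        digits.append(alphabet[rem])
        if n == 0:
            break
    return "".join(reversed(digits)).rjust(width, alphabet[0])


def is_byte_ordered(alphabet):
    """True if the characters are in strictly ascending code-point order."""
    return all(a < b for a, b in zip(alphabet, alphabet[1:]))


def ids_sort_like_integers(alphabet, width, count=5000):
    """True if plain string sorting of the first `count` ids keeps numeric order."""
    ids = [encode(i, alphabet, width) for i in range(count)]
    return ids == sorted(ids)
```

Under that assumption, the quoted alphabet is not in ascending order, because the digits sort before the uppercase letters in ASCII. Even with an ordered alphabet, variable-length ids stop sorting numerically once a second character is needed ("10" sorts before "2"), so the sketch pads ids to a fixed `width`; for example, `ids_sort_like_integers(ASCII_ALPHABET, width=4)` holds, while the same call with `QUOTED_ALPHABET` does not.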