From: Daniel Gonzalez
Date: Fri, 16 Mar 2012 15:41:24 +0100
Subject: Re: Size of couchdb documents
To: user@couchdb.apache.org
In-Reply-To: <003240CB-2132-439B-AD7D-6AFBB00DA352@apache.org>

> If memory serves the database's by_id tree uses Erlang term sorting for collation instead of ICU. ICU is of course the default collation option for MR views. Regards,
>
> Adam

That is interesting. I will try to confirm that, because that would mean that the dictionary that I am using now:

"-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"

which is ICU ordered, would not be optimal for the doc_ids. Can you tell me what an "Erlang term order" base64 dictionary would look like?

Anyway, I am curious: I understand that the size of the doc_id is going to have a big impact on the performance and size of the database, since the doc_id is going to be present in a lot of internal structures. What I do not fully understand is why the *ordering* of doc_ids when inserting documents into the database is going to have any effect on insert speed or view generation. In my naive view of couchdb, the documents are just written to a big file system file as they are POSTed to couchdb, in the order that they arrive. How would the doc_id order affect this process?
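
On the dictionary question, here is a minimal sketch in Python. It assumes that doc_ids in the by_id tree are compared as raw bytes, which is how Erlang orders binaries, and that the dictionary is only used to turn sequence numbers into fixed-width ids. The names ICU_ALPHABET, BYTE_ALPHABET and encode are made up for illustration, and nothing here is CouchDB API.

```python
# The 64-character dictionary from the message, in ICU order.
ICU_ALPHABET = "-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"

# Assumption: ids are compared byte by byte. For ASCII characters that is
# the same as sorting by code point, so re-sorting gives the byte-order dictionary.
BYTE_ALPHABET = "".join(sorted(ICU_ALPHABET))


def encode(n, alphabet, width=4):
    """Encode a non-negative integer as a fixed-width string over `alphabet`."""
    base = len(alphabet)
    if n < 0 or n >= base ** width:
        raise ValueError("n does not fit in the requested width")
    digits = []
    for _ in range(width):
        n, remainder = divmod(n, base)
        digits.append(alphabet[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


# Ids generated from an increasing counter with each dictionary.
ids_byte = [encode(n, BYTE_ALPHABET) for n in range(5000)]
ids_icu = [encode(n, ICU_ALPHABET) for n in range(5000)]

print(BYTE_ALPHABET)
# Python compares str by code point, which matches byte order for ASCII.
print(ids_byte == sorted(ids_byte))  # True: counter order equals byte order
print(ids_icu == sorted(ids_icu))    # False: "---@" sorts after "---0" in bytes
```

Under that byte-order assumption the dictionary would start with "-", then the digits, then "@", then A to Z, then a to z. Ids encoded with it arrive in increasing order when the counter increases, which is not the case for the ICU-ordered dictionary.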