From: Peter Hsu
To: "user@couchdb.apache.org"
Date: Tue, 7 Jul 2009 13:52:56 -0700
Subject: RE: view size extremely disk inefficient

Is that available in 0.9.0? I tried a POST but I get a 405 error.

[Tue, 07 Jul 2009 13:46:08 GMT] [info] [<0.22689.306>] 172.16.80.64 - - 'GET' /mydb/_compact/meta_in 404
[Tue, 07 Jul 2009 13:47:43 GMT] [info] [<0.22706.306>] 172.16.80.64 - - 'POST' /mydb/_compact/meta_in 405

-----Original Message-----
From: Adam Kocoloski [mailto:kocolosk@apache.org]
Sent: Tuesday, July 07, 2009 12:16 PM
To: user@couchdb.apache.org
Subject: Re: view size extremely disk inefficient

On Jul 7, 2009, at 2:56 PM, Peter Hsu wrote:

> I need help explaining why the sizes of my views are so large.
>
> The emitted rows for the document have a key length of about 40
> bytes (it's an array, if that matters) and a view length of about
> 400 bytes (raw json). However, I'm seeing over 2k/row average over
> the view. Factoring in the overhead of writing the btrees still
> doesn't really make sense.
>
> At 10M docs, I have almost 3.2k/message. The views were generated
> at 3000 doc increments. At the beginning, with the first view
> generation, the view size was about 2MB. This is about 800 bytes
> per row, which could be reasonable.
>
> I measured the incremental size of the view after every 3000 rows
> were added to the view. By the time I'm at 30k rows, I'm seeing an
> increment in the view size of 4MB, which is over 1k/doc. By the end
> of 10M messages, it's over a 10MB increment, which is over 3k/doc.
>
> It may be interesting that a lot of my keys are identical. Does
> that affect things?
>
> This is with the 0.9.0 running on cent5 (64 bit).
>
> Peter

Hi Peter, have you tried compacting the view?

POST /dbname/_compact/designname

Best, Adam
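For anyone trying the request from a script rather than by hand, here is a minimal Python sketch of POST /dbname/_compact/designname. The host and port (http://localhost:5984) are assumptions, and the helper name build_view_compact_request is made up for illustration. The database and design names (mydb, meta_in) are taken from the log lines in this thread. The 405 reported against 0.9.0 suggests that release may not accept this endpoint, so treat this as a sketch for a server that does.

```python
import urllib.request


def build_view_compact_request(base_url, db, design):
    """Build (but do not send) a POST /<db>/_compact/<design> request.

    Hypothetical helper for illustration; not part of any CouchDB client.
    """
    url = f"{base_url.rstrip('/')}/{db}/_compact/{design}"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint takes no payload
        method="POST",
        # Assumption: some CouchDB releases reject the request without this header.
        headers={"Content-Type": "application/json"},
    )


# Assumed server address; db and design names come from the logs above.
req = build_view_compact_request("http://localhost:5984", "mydb", "meta_in")
print(req.get_method(), req.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

The request is built separately from sending so the method and path can be checked before hitting a server. A 405 on sending would point at the server version rather than at the request itself.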