Subject: Re: Get the 20 posts, of the last seven days, ordered by number of hits
From: Filippo Fadda
Date: Tue, 23 Jul 2013 02:23:13 +0200
To: user@couchdb.apache.org

I just want to provide a hint for anyone dealing with the same issue.

I was importing over 21000 posts, and for each one I was generating a number of documents of type 'hit' equal to the number of times the article has been viewed. This approach lets you track the number of views per post while avoiding update conflicts.

Unfortunately, I found that the database itself (without any views) had grown to 8 GB after importing 1400 documents. So, at the end of the process, I can imagine a database of about 120 GB without any views. This strategy therefore can't be applied unless you have a huge disk, on the order of terabytes, because each view will claim a lot of additional space.
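To make the 120 GB estimate explicit, here is a back-of-the-envelope sketch. It assumes the database keeps growing linearly and that the 1400 figure counts posts out of the 21000 to import; the variable names are only illustrative.

```python
# Rough extrapolation; assumes size grows linearly with the number of imports.
imported_so_far = 1400   # imports done when the size was measured (assumed to be posts)
size_so_far_gb = 8       # database size at that point, without any views
total_posts = 21000      # total number of posts to import

gb_per_post = size_so_far_gb / imported_so_far
estimated_total_gb = gb_per_post * total_posts

print(f"estimated size: about {estimated_total_gb:.0f} GB")
```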
I think that, to avoid conflicts on post updates, it's a better idea to use another document type, called 'viewcount', to store the number of hits of each post, and to periodically delete the 'hit' documents that have already been counted, running a compaction every few days to reclaim their space. Those 'hit' documents are in fact going to take up a lot of space.

The thing I noticed is that CouchDB really needs patterns. Patterns are generic solutions to common and recurring problems. People, including myself, don't know how to model their schema to obtain a specific result, because of a lack of experience or, more simply, because CouchDB needs patterns like any RDBMS does. So maybe it's a good idea to write a CouchDB patterns book. :-)

-Filippo
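P.S. A minimal sketch of the 'viewcount' rollup idea, in plain Python with no CouchDB client involved. The document shapes (a `post_id` field on 'hit' documents, a `count` field on 'viewcount' documents) and the `rollup_hits` helper are hypothetical, not something I have in production. It only builds the list of documents you would then write back in one bulk update; marking a document with `_deleted: true` is how a deletion is expressed in such an update.

```python
from collections import Counter


def rollup_hits(hit_docs, viewcount_docs):
    """Fold 'hit' documents into per-post 'viewcount' documents.

    Returns the documents to write back in one bulk update:
    the updated 'viewcount' docs followed by deletion stubs for the hits.
    """
    # Count the pending hits for each post.
    new_hits = Counter(doc["post_id"] for doc in hit_docs)

    # Index the existing counters by post (copied, so inputs stay untouched).
    counters = {doc["post_id"]: dict(doc) for doc in viewcount_docs}
    for post_id, n in new_hits.items():
        counter = counters.setdefault(
            post_id, {"type": "viewcount", "post_id": post_id, "count": 0}
        )
        counter["count"] += n

    # Mark the consumed hits as deleted; a later compaction reclaims the space.
    deletions = [
        {"_id": doc["_id"], "_rev": doc["_rev"], "_deleted": True}
        for doc in hit_docs
    ]
    updated = [counters[post_id] for post_id in new_hits]
    return updated + deletions


# Hypothetical sample data.
hits = [
    {"_id": "h1", "_rev": "1-a", "type": "hit", "post_id": "p1"},
    {"_id": "h2", "_rev": "1-b", "type": "hit", "post_id": "p1"},
    {"_id": "h3", "_rev": "1-c", "type": "hit", "post_id": "p2"},
]
existing = [
    {"_id": "vc-p1", "_rev": "3-x", "type": "viewcount", "post_id": "p1", "count": 10},
]

batch = rollup_hits(hits, existing)
for doc in batch:
    print(doc)
```

The sketch assumes a single process runs the rollup, so the 'viewcount' documents have only one writer and don't run into the update conflicts the 'hit' documents were meant to avoid.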