From: Wendall Cada
Date: Thu, 14 Mar 2013 20:22:50 -0700
To: user@couchdb.apache.org
Subject: Re: Tracking doc access

Updating the doc with a timestamp on every access, i.e. one write per read, would perform very poorly in CouchDB. The best approach is to create a separate stats database. Every time a doc in the database you are tracking is accessed, create a doc describing the request in the stats database. Creating new docs in CouchDB is very inexpensive, so you won't see the performance issues you would get from updating docs per request.
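A minimal sketch of that per-access logging step, in Python. Everything named here is an assumption for illustration: the stats database URL, the helper names `build_access_doc` and `build_insert_request`, and the choice of ISO 8601 UTC strings for the timestamp. It only relies on CouchDB accepting a JSON POST to the database URL to create a doc with a server-generated id.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical location of the stats database; adjust to your setup.
STATS_DB_URL = "http://localhost:5984/stats"


def build_access_doc(tracked_db, doc_id, when=None):
    """Return a stats doc describing one read of doc_id in tracked_db."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "db": tracked_db,
        "id": doc_id,
        # Assumption: ISO 8601 UTC strings, which sort chronologically.
        "timestamp": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def build_insert_request(doc, stats_db_url=STATS_DB_URL):
    """Build (but do not send) the POST that creates the stats doc."""
    return urllib.request.Request(
        stats_db_url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually record an access, pass the request to `urllib.request.urlopen(build_insert_request(doc))` from whatever code serves the read. Each access becomes a new doc, so no existing doc is ever updated.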
Create a new doc in the stats db like this:

    {
      "db": "name_of_tracked_db",
      "id": "_id_of_doc_being_tracked",
      "timestamp": timestamp
    }

Then create a view in the stats database that maps these values. You can create several view indexes to slice the data whichever way you need.

The view:

    "doc_access": {
      "map": "function(doc) { emit([doc.db, doc.id, doc.timestamp], 1); }",
      "reduce": "_sum"
    }

A mock query to see the number of times a doc was accessed over the entire date range would be:

    http://localhost:5984/stats/_design/data/_view/doc_access?startkey=["name_of_tracked_db","_id_of_doc_being_tracked",""]&endkey=["name_of_tracked_db","_id_of_doc_being_tracked",{}]&group_level=2

group_level=2 groups the rows on [db, id], so all timestamps for a doc are summed into a single count; group=true would instead return one row per distinct timestamp. The "" lower bound assumes timestamps are stored as strings. Numbers collate before strings, so use null there if you store numeric timestamps.

You'd get back a result like this:

    {"rows": [
      {"key": ["name_of_tracked_db", "_id_of_doc_being_tracked"], "value": 42}
    ]}

If you want results for a specific range of dates, simply put the dates in the third component of startkey and endkey.

This method lets you get access counts for an entire db, a range of docs, or a single doc for any given period of time.

The advantages of this approach:

1. It's fast.
2. It is extremely flexible.

The disadvantage is that it takes up a lot of disk space if you never purge old items from the db. I've been tracking every single page request to our servers this way, with quite a bit of metadata in the docs, since Dec. 2010. That database is currently 5GB compacted for ~50k page requests per day over this period. I have never needed to delete a single doc from this db.

I don't have any benchmarks comparing the two methods, but I'd strongly discourage a write-per-read model for your accessed docs.

For an understanding of how ordering in views works, see http://wiki.apache.org/couchdb/View_collation

HTH,

Wendall

On 03/14/2013 07:16 PM, Stephan Bardubitzki wrote:
> Hi Thomas,
>
> no, I need only to track read, and I need the timestamp for some charts.
>
> Stephan
>
> On 13-03-14 07:02 PM, Thomas Hommers wrote:
>> Hi Stephan,
>>
>> With 'accessed' do you mean read and write ? In case you just want to
>> track write access i believe you could use the _rev attribute.
>>
>> Regards
>> Thomas
>>
>>
>> ----- Reply message -----
>> From: "Stephan Bardubitzki"
>> To: "user@couchdb.apache.org"
>> Subject: Tracking doc access
>> Date: Fri, Mar 15, 2013 08:57
>>
>> Hi there,
>>
>> I have a task where I need to track how often a doc is accessed. The two
>> possible ways I can think of are:
>>
>> 1. add an array to the doc and add the timestamp when it is accessed
>> 2. create a new document and add the doc._id and the timestamp
>>
>> Which one would you prefer? Or is there a better solution?
>>
>> Thanks,
>> Stephan