couchdb-user mailing list archives

From Stephan Bardubitzki <step...@bardubitzki.com>
Subject Re: Tracking doc access
Date Fri, 15 Mar 2013 03:59:09 GMT
Hi Wendall,

This is just what I was looking for. Thanks also for your feedback on 
performance; you have saved me a lot of time.

Stephan


On 13-03-14 08:22 PM, Wendall Cada wrote:
> Updating the doc with a timestamp on every read means one write per 
> read, and that would perform very poorly in CouchDB.
>
> The best approach is to create a separate stats database. Every time a 
> doc in the database you are tracking is accessed, create a doc 
> describing the request in the stats database. Creating new docs in 
> CouchDB is very inexpensive, so you'll not see any performance issues 
> with this versus updating docs per request.
>
> Create a new doc in the stats db like this:
> {
> "db": "name_of_tracked_db",
> "id": "_id_of_doc_being_tracked",
> "timestamp": timestamp
> }
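
A minimal sketch of building such a stats doc in JavaScript. The helper name `makeAccessDoc` is hypothetical, and storing the timestamp as an ISO 8601 string is an assumption (the message above leaves the timestamp format open). The resulting object would then be sent to the stats database, for example with a POST so that CouchDB assigns the `_id`.

```javascript
// Hypothetical helper: builds one access-log document for the stats database.
// Field names follow the example above; the ISO 8601 string timestamp is an
// assumption, chosen because such strings sort chronologically as view keys.
function makeAccessDoc(trackedDb, docId, when = new Date()) {
  return {
    db: trackedDb,
    id: docId,
    timestamp: when.toISOString()
  };
}

// Example with a fixed date so the output is reproducible.
const accessDoc = makeAccessDoc(
  "name_of_tracked_db",
  "_id_of_doc_being_tracked",
  new Date(Date.UTC(2013, 2, 15, 3, 59, 9))
);
console.log(JSON.stringify(accessDoc));
```
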
>
> Then create a view in the stats database that emits these values as 
> its key. You can create several view indexes to slice the data 
> whichever way you need.
>
> The view:
> "doc_access": {
>     "map": "function(doc) { emit([doc.db, doc.id, doc.timestamp], 1); }",
>     "reduce": "_sum"
> }
>
> A mock query for this to see the number of times a doc was accessed 
> over the entire date range would be:
>
> http://localhost:5984/stats/_design/data/_view/doc_access?startkey=["name_of_tracked_db","_id_of_doc_being_tracked"]&endkey=["name_of_tracked_db","_id_of_doc_being_tracked",{}]&group_level=2
>
> You'd get back a result like this:
> {"rows": [
> {"key":["name_of_tracked_db","_id_of_doc_being_tracked"], "value": 42}
> ]}
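
To illustrate how a row with a two-part key and a summed value comes about, here is a small local simulation in JavaScript. It is only a sketch: CouchDB runs the map and reduce steps on the server, and `countByDbAndId` plus the sample docs are made up for this example. It mimics the map function above and a `_sum` reduce grouped on the first two key components, which is what `group_level=2` does in a view query.

```javascript
// Same logic as the view's map function, with emit passed in explicitly.
function mapDocAccess(doc, emit) {
  emit([doc.db, doc.id, doc.timestamp], 1);
}

// Hypothetical stand-in for "_sum" with grouping on [db, id].
function countByDbAndId(docs) {
  const totals = new Map();
  for (const doc of docs) {
    mapDocAccess(doc, (key, value) => {
      const groupKey = JSON.stringify(key.slice(0, 2)); // drop the timestamp
      totals.set(groupKey, (totals.get(groupKey) || 0) + value);
    });
  }
  return [...totals].map(([key, value]) => ({ key: JSON.parse(key), value }));
}

// Made-up sample data: two reads of one doc, one read of another.
const sampleDocs = [
  { db: "blog", id: "post-1", timestamp: "2013-03-14T19:16:00Z" },
  { db: "blog", id: "post-1", timestamp: "2013-03-15T03:59:09Z" },
  { db: "blog", id: "post-2", timestamp: "2013-03-15T04:10:00Z" }
];
const rows = countByDbAndId(sampleDocs);
console.log(JSON.stringify({ rows }));
```
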
>
> If you want to get results for a specific range of dates, put the 
> start and end dates in the third component of startkey and endkey.
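
A date-range query URL could be assembled roughly like this. This is a sketch under assumptions: the helper name `docAccessUrl` is hypothetical, timestamps are assumed to be stored as ISO 8601 strings (so string order matches date order), and `group_level=2` is used so the counts collapse to one row per [db, id]. The host, database name and design document name are taken from the mock query in the message.

```javascript
// Hypothetical helper: builds the view URL for one doc and one date range.
function docAccessUrl(baseUrl, trackedDb, docId, fromIso, toIso) {
  const params = new URLSearchParams({
    // View keys are JSON arrays, so each key is JSON-encoded before
    // URLSearchParams percent-encodes it for the query string.
    startkey: JSON.stringify([trackedDb, docId, fromIso]),
    endkey: JSON.stringify([trackedDb, docId, toIso]),
    group_level: "2"
  });
  return `${baseUrl}/stats/_design/data/_view/doc_access?${params}`;
}

const rangeUrl = docAccessUrl(
  "http://localhost:5984",
  "name_of_tracked_db",
  "_id_of_doc_being_tracked",
  "2013-03-01T00:00:00Z",
  "2013-03-31T23:59:59Z"
);
console.log(rangeUrl);
```
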
>
> This method gives you the ability to get stats for the access counts 
> for an entire db, a range of docs, or a single doc for any given 
> period of time.
>
> The advantages of this approach: 1. it's fast, 2. it's extremely flexible.
>
> The disadvantage is that it takes up a ton of disk space if you never 
> purge old items from the db. I've been tracking every single page 
> request to our servers in this way with quite a bit of metadata in the 
> docs since Dec. 2010. That database is currently 5GB compacted for 
> ~50k page requests per day over this period of time. I never had the 
> need to delete a single doc from this db.
>
> I don't have any benchmarks for a comparison between the two methods, 
> but I'd strongly discourage a write per read model for your accessed 
> docs.
>
> For an understanding about how the ordering for views works, see 
> http://wiki.apache.org/couchdb/View_collation
>
> HTH,
>
> Wendall
>
> On 03/14/2013 07:16 PM, Stephan Bardubitzki wrote:
>> Hi Thomas,
>>
>> No, I only need to track reads, and I need the timestamp for some charts.
>>
>> Stephan
>>
>> On 13-03-14 07:02 PM, Thomas Hommers wrote:
>>> Hi Stephan,
>>>
>>> By 'accessed', do you mean read and write? If you just want to 
>>> track write access, I believe you could use the _rev attribute.
>>>
>>> Regards
>>> Thomas
>>>
>>>
>>>
>>> ----- Reply message -----
>>> From: "Stephan Bardubitzki" <stephan@bardubitzki.com>
>>> To: "user@couchdb.apache.org" <user@couchdb.apache.org>
>>> Subject: Tracking doc access
>>> Date: Fri, Mar 15, 2013 08:57
>>>
>>>
>>>
>>> Hi there,
>>>
>>> I have a task where I need to track how often a doc is accessed. The two
>>> possible ways I can think of are:
>>>
>>>   1. add an array to the doc and add the timestamp when it is accessed
>>>   2. create a new document and add the doc._id and the timestamp
>>>
>>> Which one would you prefer? Or is there a better solution?
>>>
>>> Thanks,
>>> Stephan
>>>
>>>
>>> --------------------------------
>>> Spam/Virus scanning by CanIt Pro
>>>
>>> For more information see
>>> http://www.kgbinternet.com/SpamFilter.htm
>>>
>>> To control your spam filter, log in at
>>> http://filter.kgbinternet.com
>>>
>>
>
>

