couchdb-user mailing list archives

From "maku@makuchaku.in" <m...@makuchaku.in>
Subject Using couchdb for analytics
Date Thu, 02 Jun 2011 09:34:03 GMT
Hi everyone,

I came across couchdb a couple of weeks back & got really excited by
the fundamental change it brings by simply taking the app-server out
of the picture.
Must say, kudos to the dev team!

I am planning to write a quick analytics solution for my website -
something along the lines of Google Analytics - which will measure
certain properties of the visitors hitting our site.

Since this is my first attempt at a JSON-style document store, I
thought I'd share the architecture & see if I can make it better (or
correct my mistakes before I make them) :-)

- For each unique visitor, create a document with his session_id as the doc.id
- For each property I need to track about this visitor, create a
key-value pair in the doc created for this visitor
- If the visitor is a returning user, use the session_id to re-open his
doc & keep on modifying the properties
- At the end of each calculation period (say 1 hour or 24 hours), run a
cron job which triggers the map-reduce jobs by requesting the views
over curl/HTTP.
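
To make the plan above concrete, here is a rough sketch of the data model in Python. It is only an illustration: the `db` dict is an in-memory stand-in for the CouchDB database, and `track`, `map_browser`, `run_view` and the property names (`browser`, `landing_page`, `pages_viewed`) are made up for the example. Real CouchDB views would normally be written in JavaScript and queried over HTTP.

```python
from collections import Counter

# Hypothetical in-memory stand-in for a CouchDB database: doc id -> document.
db = {}


def track(session_id, **properties):
    """Create the visitor's doc on first sight, otherwise update its properties."""
    doc = db.setdefault(session_id, {"_id": session_id})
    doc.update(properties)
    return doc


def map_browser(doc):
    """Emit one (key, value) row per document, in the spirit of a map function."""
    if "browser" in doc:
        yield doc["browser"], 1


def run_view(docs, map_fn):
    """Sum the emitted values per key, in the spirit of a grouped sum reduce."""
    totals = Counter()
    for doc in docs:
        for key, value in map_fn(doc):
            totals[key] += value
    return dict(totals)


track("sess-1", browser="firefox", landing_page="/")
track("sess-2", browser="chrome")
track("sess-1", pages_viewed=3)  # returning visitor: same doc, new property

# What the periodic cron job would ask for: visitors per browser.
report = run_view(db.values(), map_browser)
```

Here `report` holds one count per browser, and the returning visitor still has a single document keyed by session_id.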

A couple of questions based on the above architecture...
We see concurrent traffic ranging from 2k to 5k users.
- Would a couchdb instance running on a good machine (say High CPU
EC2, medium instance) work well with simultaneous writes happening...
(visitors browsing, properties changing or getting created)
- With a couple of million documents, would I be able to process my
views without causing any significant impact to write performance?
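
One detail behind the simultaneous-writes question: as far as I understand, CouchDB rejects an update whose `_rev` does not match the stored revision (HTTP 409 Conflict), so two requests touching the same visitor doc need a re-read and retry. A minimal sketch of that loop follows, under loud assumptions: `FakeStore`, `Conflict` and `update_with_retry` are hypothetical names, the store is in-memory, and revisions are plain integers rather than CouchDB's real revision strings.

```python
class Conflict(Exception):
    """Stands in for CouchDB's HTTP 409 Conflict response."""


class FakeStore:
    """Hypothetical in-memory store that enforces revision checks on write."""

    def __init__(self):
        self.docs = {}

    def get(self, doc_id):
        doc = self.docs.get(doc_id)
        return dict(doc) if doc else None

    def put(self, doc):
        current = self.docs.get(doc["_id"])
        current_rev = current["_rev"] if current else None
        # Reject the write if the caller's revision is stale or missing.
        if doc.get("_rev") != current_rev:
            raise Conflict(doc["_id"])
        stored = dict(doc)
        stored["_rev"] = (current_rev or 0) + 1  # integer revs: a simplification
        self.docs[doc["_id"]] = stored
        return stored["_rev"]


def update_with_retry(store, doc_id, changes, max_attempts=5):
    """Re-read the latest doc, apply the changes, and retry on conflict."""
    for _ in range(max_attempts):
        doc = store.get(doc_id) or {"_id": doc_id}
        doc.update(changes)
        try:
            return store.put(doc)
        except Conflict:
            continue
    raise Conflict(doc_id)


store = FakeStore()
first_rev = update_with_retry(store, "sess-1", {"browser": "firefox"})
second_rev = update_with_retry(store, "sess-1", {"pages_viewed": 3})
```

Whether retries like this become a bottleneck at 2k to 5k concurrent users is exactly the open question; the sketch only shows the mechanics.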

I think my questions might be biased by the fact that I come from a
MySQL/Rails background... :-)

Let me know what you think about this.

Thanks in advance,
--
Mayank
http://adomado.com
