incubator-couchdb-user mailing list archives

From "maku@makuchaku.in" <m...@makuchaku.in>
Subject Re: Using couchdb for analytics
Date Thu, 02 Jun 2011 15:04:26 GMT
It's 700 req/min :)
--
Mayank
http://adomado.com



On Thu, Jun 2, 2011 at 7:10 PM, Jan Lehnardt <jan@apache.org> wrote:

>
> On 2 Jun 2011, at 13:28, maku@makuchaku.in wrote:
>
> > Forgot to mention...
> > All of these 700 req/sec are write requests (data logging) & no data
> > crunching.
> > Our current in-house analytics solution (built on Rails, MySQL) gets
> >>
> >> about 700 req/min on an average day...
>
> min or sec? :)
>
> Cheers
> Jan
> --
>
>
> >>
> >> --
> >> Mayank
> >> http://adomado.com
> >>
> >>
> >>
> >>
> >> On Thu, Jun 2, 2011 at 3:16 PM, Gabor Ratky <rgabo@rgabostyle.com>
> >> wrote:
> >>> Take a look at update handlers [1]. They are a more lightweight way
> >>> to create/update your visitor documents without having to GET the
> >>> document, modify it, and PUT the whole thing back. They also simplify
> >>> dealing with document revisions, since as I understand it you should
> >>> not run into conflicts.
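For reference, a minimal sketch of what such an update handler could look
like (the design doc name, field names and merge logic are illustrative,
not from the thread):

    function (doc, req) {
      // _update handler: merge the posted properties into the visitor doc.
      // If no doc exists for this session id yet, start a fresh one.
      if (!doc) {
        doc = { _id: req.id || req.uuid, type: "visitor" };
      }
      var props = JSON.parse(req.body);  // e.g. {"browser": "Firefox"}
      for (var key in props) {
        doc[key] = props[key];
      }
      doc.last_seen = new Date().getTime();
      return [doc, "ok"];
    }

It would be called with something like
PUT /analytics/_design/tracking/_update/track/<session_id>, sending only
the changed properties as the JSON body (database, design doc and handler
names here are made up).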
> >>>
> >>> I wouldn't expect any problem handling the concurrent traffic and
> >>> tracking the users, but the view indexer will take some time with the
> >>> processing itself. You can always replicate the database (or parts of
> >>> it using a replication filter) to another CouchDB instance and perform
> >>> the crunching there.
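As a rough illustration of that replicate-and-crunch setup (database
names, hosts and the filter name below are hypothetical):

    // Filter function, stored in a design doc on the tracking instance,
    // that only lets visitor documents through to the crunching instance.
    function (doc, req) {
      return doc.type === "visitor";
    }

    // A one-off filtered replication could then be triggered with:
    // curl -X POST http://localhost:5984/_replicate \
    //      -H 'Content-Type: application/json' \
    //      -d '{"source": "analytics",
    //           "target": "http://cruncher:5984/analytics",
    //           "filter": "tracking/visitors"}'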
> >>>
> >>> It's fairly vague how many updates/writes your 2k-5k concurrent
> >>> traffic would cause. How many requests/sec does your site see? How
> >>> many property updates does that cause?
> >>>
> >>> Btw, CouchDB users, is there any way to perform bulk updates using
> >>> update handlers, similar to _bulk_docs?
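For comparison, _bulk_docs batches whole documents in a single POST to
/analytics/_bulk_docs with a body along these lines (names and data are
illustrative):

    {
      "docs": [
        { "_id": "sess-123", "type": "visitor", "browser": "Firefox" },
        { "_id": "sess-456", "type": "visitor", "browser": "Chrome" }
      ]
    }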
> >>>
> >>> Gabor
> >>>
> >>> [1] http://wiki.apache.org/couchdb/Document_Update_Handlers
> >>>
> >>> On Thursday, June 2, 2011 at 11:34 AM, maku@makuchaku.in wrote:
> >>>
> >>>> Hi everyone,
> >>>>
> >>>> I came across CouchDB a couple of weeks back & got really excited
> >>>> by the fundamental change it brings by simply taking the app-server
> >>>> out of the picture.
> >>>> Must say, kudos to the dev team!
> >>>>
> >>>> I am planning to write a quick analytics solution for my website -
> >>>> something along the lines of Google Analytics - which will measure
> >>>> certain properties of the visitors hitting our site.
> >>>>
> >>>> Since this is my first attempt at a JSON-style document store, I
> >>>> thought I'd share the architecture & see if I can make it better
> >>>> (or correct my mistakes before I make them) :-)
> >>>>
> >>>> - For each unique visitor, create a document with his session_id as
> >>>> the doc._id
> >>>> - For each property I need to track about this visitor, I create a
> >>>> key-value pair in the doc created for this visitor
> >>>> - If the visitor is a returning user, use the session_id to re-open
> >>>> his doc & keep modifying the properties
> >>>> - At the end of each calculation period (say 1 hour or 24 hours), I
> >>>> run a cron job which fires the map-reduce jobs by requesting the
> >>>> views over curl/http (sketched below).
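To make the sketched architecture concrete, a visitor document and the
kind of view the cron job would request might look roughly like this
(field names plus the design doc and view names are illustrative):

    // A visitor document, keyed by session_id:
    // { "_id": "sess-8f3a2c", "type": "visitor",
    //   "browser": "Firefox", "country": "IN", "visits": 3 }

    // Map function: emit one row per visitor, keyed by a tracked property.
    function (doc) {
      if (doc.type === "visitor" && doc.browser) {
        emit(doc.browser, 1);
      }
    }

    // Reduce: the built-in _sum turns those rows into a count per browser.
    // The hourly cron would then request something like:
    // curl 'http://localhost:5984/analytics/_design/stats/_view/by_browser?group=true'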
> >>>>
> >>>> A couple of questions based on above architecture...
> >>>> We see concurrent traffic ranging from 2k users to 5k users.
> >>>> - Would a CouchDB instance running on a good machine (say a High-CPU
> >>>> EC2 medium instance) work well with simultaneous writes happening
> >>>> (visitors browsing, properties changing or getting created)?
> >>>> - With a couple of million documents, would I be able to process my
> >>>> views without causing any significant impact to write performance?
> >>>>
> >>>> I think my questions might be biased by the fact that I come from a
> >>>> MySQL/Rails background... :-)
> >>>>
> >>>> Let me know what you guys think about this.
> >>>>
> >>>> Thanks in advance,
> >>>> --
> >>>> Mayank
> >>>> http://adomado.com
> >>>
> >>>
> >>
>
>
