incubator-couchdb-user mailing list archives

From Dmitriy Fot <dmitriy....@gmail.com>
Subject CouchDB document design for facebook insights
Date Fri, 07 Dec 2012 08:32:52 GMT
Hi All,

Like many others, I am new to CouchDB and therefore not sure how to
use it properly, especially when it comes to the design of documents
and views.

I am going to use CouchDB for analytical information based on Facebook
Insights and other sources. We are going to collect this information
over time and keep it forever, and then, of course, we would like to
build analytical reports based on it.

My main concern is the proper design of the document, as we are going
to have millions of them. If possible, I would like more experienced
CouchDB users to look at it and warn me if I am about to make a big
mistake.

The proposed design of a document:

{
   "_id": "0b69a33807d4cb63680dbebc16000af5",
   "_rev": "1-7c9916592c377e32cf83acf746a8647c",
   "metrics": [       // array of metrics, one element per Facebook page,
                      // around 10 pages per document
       {
           "sourceId": "210627525692699", //facebook page ID
           "source": "facebook",
           "values": {
               "page_likes": 53
               //many more other metrics, around 100
           }
       },
       {
           "sourceId": "354413697924499", // Facebook page ID
           "source": "facebook",
           "values": {
               "page_wall_posts_source_unique": {"other": 0, "composer": 1},
               "page_likes": 12
               //many more other metrics, around 100
           }
       }
   ],
   "timestamp": [
       2012,
       10,
       15,
       10,
       0,
       0
   ],
   "customerId": "71ff942f-9283-4916-ab84-4927bce09117"
}

Expected number of documents: +10 000 every hour, +240 000 every day.

Expected requests to the documents:
- sum of values per customer, per sourceId, per metric in a given time period
- specialized views for more complex metrics
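
To make the first request concrete, here is a rough sketch of what such
a view could look like. Everything about it is an assumption rather than
a settled design: the key layout [customerId, sourceId, metric, year,
month, day, hour, minute, second], the name mapMetrics, and the small
in-memory helpers runMap and sumInRange, which only stand in for CouchDB
so that the sketch can be run under Node. In a real design document the
map function would take only doc and call the global emit, and the
summing would presumably be left to the built-in _sum reduce, queried
with startkey and endkey.

```javascript
// Map sketch: one row per numeric metric, keyed so that a key range selects
// a customer, a source, a metric and a time window. Key layout is an assumption.
function mapMetrics(doc, emit) {
  if (!Array.isArray(doc.metrics) || !Array.isArray(doc.timestamp)) return;
  doc.metrics.forEach(function (m) {
    Object.keys(m.values || {}).forEach(function (name) {
      var value = m.values[name];
      // Nested metrics (e.g. page_wall_posts_source_unique) are skipped here;
      // they would need their own specialized view.
      if (typeof value === "number") {
        emit([doc.customerId, m.sourceId, name].concat(doc.timestamp), value);
      }
    });
  });
}

// Local stand-in for CouchDB: run the map over some documents, collect rows.
function runMap(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    mapMetrics(doc, function (key, value) {
      rows.push({ key: key, value: value });
    });
  });
  return rows;
}

// Element-by-element comparison of two numeric arrays.
function compareArrays(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

// Stand-in for a _sum reduce over a startkey/endkey range (both inclusive).
function sumInRange(rows, customerId, sourceId, metric, from, to) {
  return rows
    .filter(function (r) {
      var ts = r.key.slice(3);
      return r.key[0] === customerId && r.key[1] === sourceId &&
        r.key[2] === metric &&
        compareArrays(ts, from) >= 0 && compareArrays(ts, to) <= 0;
    })
    .reduce(function (sum, r) { return sum + r.value; }, 0);
}

var customer = "71ff942f-9283-4916-ab84-4927bce09117";
var sampleDocs = [
  { // trimmed copy of the proposed document
    customerId: customer,
    timestamp: [2012, 10, 15, 10, 0, 0],
    metrics: [
      { sourceId: "210627525692699", source: "facebook",
        values: { page_likes: 53 } },
      { sourceId: "354413697924499", source: "facebook",
        values: { page_wall_posts_source_unique: { other: 0, composer: 1 },
                  page_likes: 12 } }
    ]
  },
  { // hypothetical document one hour later, made up for this example
    customerId: customer,
    timestamp: [2012, 10, 15, 11, 0, 0],
    metrics: [
      { sourceId: "210627525692699", source: "facebook",
        values: { page_likes: 55 } }
    ]
  }
];

var rows = runMap(sampleDocs);
console.log(sumInRange(rows, customer, "210627525692699", "page_likes",
  [2012, 10, 15, 0, 0, 0], [2012, 10, 15, 23, 59, 59]));
```

With roughly 10 pages and 100 metrics per document, a view like this
would emit on the order of 1000 rows per document, which is part of why
the view update time is a concern.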


Questions:
- In order to get analytics for some complex metrics (like
page_wall_posts_source_unique) we will need to build specialized
views, probably many of them. Should I expect problems with view
update time?
- Is it the right decision to use an array for the timestamp, or is it
better to use a long?
- Should I use one design document or put every view in a new one?

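For the timestamp question, the two candidate representations can be
converted into each other, so the choice mostly affects how view keys
behave. An array key lets a reduce be grouped by year, month, day or
hour through group_level, while a single number is more compact and
gives simple numeric ranges. Below is a sketch of the conversion,
assuming the array is [year, month, day, hour, minute, second] in UTC
with a 1-based month (the document above does not say either way). Both
function names are made up for illustration.

```javascript
// Array form -> milliseconds since the Unix epoch.
// Assumes UTC and a 1-based month; Date.UTC expects a 0-based month.
function timestampToMillis(ts) {
  return Date.UTC(ts[0], ts[1] - 1, ts[2], ts[3], ts[4], ts[5]);
}

// Milliseconds since the Unix epoch -> array form, under the same assumptions.
function millisToTimestamp(ms) {
  var d = new Date(ms);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1,
    d.getUTCDate(),
    d.getUTCHours(),
    d.getUTCMinutes(),
    d.getUTCSeconds()
  ];
}

var asLong = timestampToMillis([2012, 10, 15, 10, 0, 0]);
console.log(asLong, millisToTimestamp(asLong));
```
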
Any comments are appreciated. Thank you.

Dmitriy Fot
---------------------------------
dmitriy.fot@gmail.com
