incubator-couchdb-user mailing list archives

From Filippo Fadda <filippo.fa...@programmazione.it>
Subject Re: Get the 20 posts, of the last seven days, ordered by number of hits
Date Tue, 23 Jul 2013 00:23:13 GMT
I just want to offer a hint to anyone dealing with the same issue.
I was importing over 21000 posts, and for each one I was generating a number of documents
of type 'hit' equal to the number of times the article had been viewed. This approach lets you
track the number of views per post while avoiding update conflicts.
Unfortunately, I found that the database itself (without any views) had grown to 8
GB after importing just 1400 documents. So, by the end of the process, I would expect a database
of about 120 GB, again without any views. This strategy is therefore not practical unless you
have a huge disk, on the order of terabytes, because each view will claim a lot of additional space.
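
As a rough check of that estimate, here is a minimal sketch of the arithmetic. It assumes
the 1400 imported documents correspond to 1400 of the 21000 posts and that the database
size grows linearly with the number of posts; both are assumptions, and the function name
is only illustrative.

```python
# Linear extrapolation of the database size from a partial import.
# Assumption: disk usage is proportional to the number of imported posts.

TOTAL_POSTS = 21000      # posts to import in total
IMPORTED_SO_FAR = 1400   # posts imported when the size was measured
SIZE_SO_FAR_GB = 8       # database size at that point, without any views


def estimate_size_gb(total, imported, size_gb):
    """Scale the measured size by the ratio of total to imported posts."""
    return size_gb * total / imported


estimated = estimate_size_gb(TOTAL_POSTS, IMPORTED_SO_FAR, SIZE_SO_FAR_GB)
print(f"estimated database size: {estimated:.0f} GB")
```

The view indexes would come on top of this figure.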
To avoid conflicts on post updates, I think it's a better idea to use another document type,
called 'viewcount', to store the number of hits for each post, and to run a compaction every few
days to remove the 'hit' documents that have already been counted. Those 'hit' documents
would otherwise take up a lot of space.
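
One possible way to sketch the 'viewcount' idea is a periodic job that folds the
accumulated 'hit' documents into one counter document per post and prepares the hits
for deletion. The example below works on plain dictionaries only, with no database
access. The field names `post_id` and `count`, the `viewcount:<post_id>` id scheme and
the function name `fold_hits` are hypothetical; only `_id`, `_rev` and `_deleted` are
CouchDB's own document fields.

```python
from collections import Counter


def fold_hits(hit_docs, viewcounts):
    """Fold 'hit' documents into per-post 'viewcount' documents.

    hit_docs:   list of dicts with "_id", "_rev", "type" and "post_id"
    viewcounts: dict mapping post_id -> existing 'viewcount' document
    Returns (updated viewcount documents, deletion stubs for the counted hits).
    """
    hits = [d for d in hit_docs if d.get("type") == "hit"]
    per_post = Counter(d["post_id"] for d in hits)

    updated = []
    for post_id, n in per_post.items():
        # Start from the existing counter document, or create a new one.
        doc = dict(viewcounts.get(post_id) or {
            "_id": f"viewcount:{post_id}",   # hypothetical id scheme
            "type": "viewcount",
            "post_id": post_id,
            "count": 0,
        })
        doc["count"] += n
        updated.append(doc)

    # Stubs marking the counted hits as deleted; a later compaction can then
    # reclaim the disk space they occupied.
    deletions = [
        {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True} for d in hits
    ]
    return updated, deletions


# Example data: two hits for post p1 (already at 10 views) and one for p2.
hits = [
    {"_id": "h1", "_rev": "1-a", "type": "hit", "post_id": "p1"},
    {"_id": "h2", "_rev": "1-b", "type": "hit", "post_id": "p1"},
    {"_id": "h3", "_rev": "1-c", "type": "hit", "post_id": "p2"},
]
existing = {
    "p1": {"_id": "viewcount:p1", "_rev": "3-x", "type": "viewcount",
           "post_id": "p1", "count": 10},
}

updated, deletions = fold_hits(hits, existing)
counts = {d["post_id"]: d["count"] for d in updated}
print(counts)
print(len(deletions), "hit documents marked for deletion")
```

Both lists could then be sent to the database in a bulk update, followed by a
compaction. Visitors still only create new 'hit' documents, and if this job is the
only writer of the 'viewcount' documents, it should not conflict with them.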

What I noticed is that CouchDB really needs patterns. Patterns are generic solutions
to common and recurring problems. People, myself included, don't know how to model their
schema to obtain a specific result, either because of a lack of experience or, more simply,
because CouchDB needs patterns just like any RDBMS does. So maybe it would be a good idea to
write a CouchDB patterns book. :-)

-Filippo