incubator-couchdb-user mailing list archives

From Stefan Klein <st.fankl...@gmail.com>
Subject Re: Data partitioning strategy
Date Thu, 31 Jul 2014 10:36:23 GMT
Hi,

I'd say it strongly depends on how you want to aggregate the data.
Just to give an example: if you can assign each number to a person and
persons to departments, and you are only interested in data aggregated by
department, it would probably make sense to partition the data by department.
So what kind of reporting are you interested in?
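
A rough sketch of what partitioning by department could look like when
deciding which database a call document goes into. The field name "number",
the two lookup tables and the "calls-" prefix are all assumptions made for
illustration, not something taken from the actual schema:

```python
import re


def department_db_name(call, number_to_person, person_to_department,
                       prefix="calls"):
    """Return the name of the database a call document would be stored in.

    All names here are hypothetical: `call["number"]` is assumed to hold the
    phone number, and the two dicts map number -> person -> department.
    """
    person = number_to_person.get(call["number"])
    department = person_to_department.get(person, "unassigned")
    # CouchDB database names must be lowercase, so normalise the department
    # and replace any other character with an underscore.
    safe = re.sub(r"[^a-z0-9_]", "_", department.lower())
    return f"{prefix}-{safe}"


numbers = {"1001": "alice", "1002": "bob"}
departments = {"alice": "Sales", "bob": "Tech Support"}

print(department_db_name({"number": "1002"}, numbers, departments))
```

Calls from numbers that cannot be matched end up in a single catch-all
database, which keeps the per-department views free of unassigned data.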

Side note: If you stick to one DB per year, I wouldn't replicate the data
but rather move the db file on New Year's Eve (using cron or at). I'm not
sure whether you have to stop and start CouchDB for that, or whether it's
sufficient to POST to _restart after the move.

regards,
Stefan




2014-07-31 12:13 GMT+02:00 Daniel Gonzalez <gonvaled@gonvaled.com>:

> I have a big database with calls generated from a Telephony System
> (Asterisk).
>
> Hi *,
>
> I have asterisk servers creating calls in their local databases and
> replicating those to a central server to aggregate all calls in the
> system. The calls database also has several related views which,
> together with the large amount of data, are making the database grow
> too big. I have decided to do the following partitioning, based on the
> calldate:
>
> 1. Keep a full database for backup purposes (calls-all). This can be
> moved to a backup server, and deleted whenever I feel it is no
> longer needed.
> 2. Keep a current database with the calls from the last two years:
> calls-curr (currently with 2013 and 2014) so that customers can access
> the recent history.
> 3. Archive old calls to per-year databases calls-2010, calls-2011,
> calls-2012.
>
> The goal is to delete the database calls-all whenever I feel that this
> strategy is implemented correctly.
>
> And the replications would be set up like this (continuous replication):
>
> ast1-server            -> calls-curr@main-server    (to aggregate
> calls for the current years)
> calls-curr@main-server -> calls-all@main-server     (to aggregate all
> calls, for backup purposes)
>
> And to archive the calls, I would do a one-off filtered replication like
> this:
>
> calls-all@main-server -> calls-2010@main-server (with a filter for 2010)
>
> (I would also replicate the design docs, so that the per-year
> databases have the same functionality)
>
> Does this strategy make sense? Is there a better way? Any other ideas
> that I can take on how people are partitioning data with CouchDB?
>
> Thanks!
> Daniel Gonzalez
>
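
For reference, the one-off filtered replication described above could be
expressed roughly as follows. This is only a sketch: the design document
name "archive", the filter name "by_year" and the assumption that calldate
is a string starting with the four-digit year are mine, not part of the
original setup. The dict returned by the helper is the JSON body one would
POST to /_replicate on main-server, and the filter has to live in the
source database (calls-all).

```python
import json

# Hypothetical design document to store in the source database. The filter
# is ordinary CouchDB filter JavaScript; it assumes doc.calldate is a string
# such as "2010-03-15 12:00:00". Documents without a calldate (for example
# design docs) are not passed through, so they must be copied separately.
FILTER_DDOC = {
    "_id": "_design/archive",
    "filters": {
        "by_year": (
            "function(doc, req) {"
            " return typeof doc.calldate === 'string'"
            " && doc.calldate.substr(0, 4) === req.query.year;"
            " }"
        )
    },
}


def archive_replication_body(year, source="calls-all", prefix="calls"):
    """Build the body of a one-off filtered replication for a single year."""
    return {
        "source": source,
        "target": f"{prefix}-{year}",
        "create_target": True,               # create calls-<year> if missing
        "filter": "archive/by_year",         # <ddoc name>/<filter name>
        "query_params": {"year": str(year)}, # available as req.query.year
    }


print(json.dumps(archive_replication_body(2010), indent=2))
```

Leaving out "continuous" makes the replication run once and stop, which
matches the archiving step; the same helper can be reused for 2011 and 2012.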
