incubator-couchdb-user mailing list archives

From Dave Cottlehuber <d...@muse.net.nz>
Subject Re: How to split the data over a period of time.
Date Mon, 11 Jun 2012 18:03:27 GMT
On 11 June 2012 17:42, Harikrishnan R <harikrishnan@inxsasia.com> wrote:
> Hi Dave,
>
>   Many thanks for your quick response.
>
>   I am not updating any documents, I am keep on appending docs to database
> with a specified timestamp.

OK, so it's a continuing series of new docs going into the DB.

>   One of my requirements is *unique login counts* between two specified
> dates.
>
>   my map function will emits like
>
>   *{[2012, 6, 4], acc_id_1}*
> *   {[2012, 6, 4], acc_id_2}*
> *   {[2012, 6, 4], acc_id_3}*
> *   {[2012, 6, 5], acc_id_1}*
> *   {[2012, 6, 6], acc_id_4}*
> *   .....*
> *   ....*
>
>   By using start key and end key I am able to get the unique counts.

Are you aware you can use a reduce function and have couchdb manage
that for you?
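
As a rough illustration of what a grouped count reduce would hand back
for rows like the ones quoted above, here is a plain-Python simulation.
It is not CouchDB code: the function name count_by_key and the sample
rows are assumptions made for the sketch. In CouchDB itself the
equivalent would be a built-in reduce such as _count queried with
grouping.

```python
from itertools import groupby

# Rows shaped like the quoted map output: (key, value) = ([y, m, d], account id).
rows = [
    ([2012, 6, 4], "acc_id_1"),
    ([2012, 6, 4], "acc_id_2"),
    ([2012, 6, 4], "acc_id_3"),
    ([2012, 6, 5], "acc_id_1"),
    ([2012, 6, 6], "acc_id_4"),
]


def count_by_key(rows, startkey, endkey):
    """Count rows per key within [startkey, endkey] (hypothetical helper)."""
    in_range = sorted(r for r in rows if startkey <= r[0] <= endkey)
    return [
        (key, sum(1 for _ in group))
        for key, group in groupby(in_range, key=lambda r: r[0])
    ]


# One count per day between 4 June and 5 June 2012.
per_day = count_by_key(rows, [2012, 6, 4], [2012, 6, 5])
```

Note that acc_id_1 logs in on both days, so adding up the per-day
counts would overstate the number of unique accounts across the range;
a count grouped by day answers "logins per day", not "unique logins
over the period".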

>   Here the problem is when the dates range specified may fall into backup
> or in-between.

I don't follow. Are you using separate DBs per month, or some rotation scheme?

If you're not removing or aggregating data, then all the data will be in the DB.

>   1) How do I load backup db/views on-demand and merge the results.

This is a bit of a hand-wavy question, so I'll do my best.

If you need to merge the views, you'll need to merge the documents
(e.g. replicate backup -> main DB, then regenerate the views).

Alternatively, if you can't afford the downtime, make your application
aware of the backup: restore the DB, regenerate its views, then have
the app work out how to consolidate the results.
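
A minimal sketch of that app-side consolidation, assuming the app has
already fetched the view rows from both the main DB and the restored
backup as (key, value) pairs. The function name unique_logins and the
sample rows are hypothetical, and no CouchDB client library is involved.

```python
def unique_logins(row_sets, startkey, endkey):
    """Count distinct account ids across several view results.

    row_sets is an iterable of row lists, one per database; each row is
    a (key, account_id) pair with key = [year, month, day].
    """
    accounts = set()
    for rows in row_sets:
        for key, account_id in rows:
            if startkey <= key <= endkey:
                accounts.add(account_id)  # the set removes duplicates across DBs
    return len(accounts)


# Sample rows: older data lives in the backup, recent data in the main DB.
main_rows = [([2012, 6, 5], "acc_id_1"), ([2012, 6, 6], "acc_id_4")]
backup_rows = [
    ([2012, 6, 4], "acc_id_1"),
    ([2012, 6, 4], "acc_id_2"),
    ([2012, 6, 4], "acc_id_3"),
]

total = unique_logins([main_rows, backup_rows], [2012, 6, 4], [2012, 6, 6])
```

Deduplicating in a set means an account that appears in both databases
is counted once, which is the part a simple sum of two counts would get
wrong.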

If you clarify how you're managing your data a better answer will be
forthcoming.

>   2) Do I need a cronjob for taking the backup of views and db

I wouldn't back up views *unless* you cannot afford the build time in
case of a disaster *and* a replicated copy is not sufficient.

But yes, cron is fine for triggering a backup.
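
For what it's worth, a cron-triggered backup could be as small as the
sketch below. It assumes a plain file copy of the database file is
acceptable for your setup; the function name backup_db_file, the paths
and the crontab line in the comment are all hypothetical.

```python
import shutil
import time
from pathlib import Path

# Hypothetical crontab entry running this script nightly at 02:00:
#   0 2 * * * python3 /path/to/backup_couch.py


def backup_db_file(db_file, backup_dir, now=None):
    """Copy db_file into backup_dir under a timestamped name.

    now is seconds since the epoch (defaults to the current time) and
    is only a parameter so the naming can be checked deterministically.
    """
    db_file = Path(db_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    target = backup_dir / f"{db_file.stem}-{stamp}{db_file.suffix}"
    shutil.copy2(db_file, target)  # copies contents and file metadata
    return target
```

Only the database file is copied here; in line with the advice above,
the view indexes are left to be rebuilt after a restore.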

CoW (copy-on-write): the best explanation (of a memory model) is
http://hackerboss.com/copy-on-write-101-part-1-what-is-it/ and
http://en.wikipedia.org/wiki/Snapshot_%28computer_storage%29 covers the
disk-based version.

Basically, your filesystem stores multiple images of itself,
transparently directing reads of unmodified blocks to the originals
and sending writes to new blocks. Thus the unchanged data (say, 90%)
doesn't take up any more space. From a backup point of view, this
typically means that the backup to disk is super fast (less direct
application downtime for DBs that, unlike CouchDB, don't allow a direct
copy), and on advanced/expensive disk arrays you can export that
filesystem to another host and back it up indirectly, without adding
load to the original volume.
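
To make the block-sharing idea concrete, here is a toy in-memory model.
It is a sketch only: the class CowVolume and its methods are invented
for illustration and do not correspond to any real filesystem API.

```python
class CowVolume:
    """Toy copy-on-write volume: snapshots share unmodified blocks."""

    def __init__(self, blocks):
        self.store = dict(enumerate(blocks))  # block id -> data
        self.live = list(self.store)          # live view: position -> block id
        self.snapshots = []

    def snapshot(self):
        """Record the current block table; no block data is copied."""
        self.snapshots.append(list(self.live))
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Send the write to a new block, leaving the original in place."""
        new_id = len(self.store)
        self.store[new_id] = data
        self.live[index] = new_id

    def read(self, index, snapshot=None):
        """Read from the live view, or from a snapshot if one is given."""
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.store[table[index]]


vol = CowVolume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")  # only this block gets a new copy
```

After the write, the live view sees the new block while the snapshot
still reads the original, and the two untouched blocks are stored only
once.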

A+
Dave
