incubator-couchdb-user mailing list archives

From Harikrishnan R <harikrish...@inxsasia.com>
Subject Re: How to split the data over a period of time.
Date Mon, 11 Jun 2012 15:42:10 GMT
Hi Dave,

   Many thanks for your quick response.

   I am not updating any documents; I only keep appending docs to the
database, each with a specified timestamp.

   One of my requirements is *unique login counts* between two specified
dates.

   My map function emits rows like:

   {[2012, 6, 4], acc_id_1}
   {[2012, 6, 4], acc_id_2}
   {[2012, 6, 4], acc_id_3}
   {[2012, 6, 5], acc_id_1}
   {[2012, 6, 6], acc_id_4}
   ...

   By using start key and end key I am able to get the unique counts.
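
   A minimal sketch of that lookup, written in Python for illustration only
(it is not the Java reader application). The function names are hypothetical,
and it assumes the view returns rows in CouchDB's usual shape, with the
[year, month, day] array as "key" and the account id as "value".

```python
import json
from urllib.parse import urlencode


def view_query(start_date, end_date):
    """Build the query string for a view keyed by [year, month, day].

    startkey and endkey must be JSON-encoded before being URL-encoded.
    """
    params = {
        "startkey": json.dumps(list(start_date), separators=(",", ":")),
        "endkey": json.dumps(list(end_date), separators=(",", ":")),
    }
    return urlencode(params)


def unique_logins(rows):
    """Count distinct account ids among the rows returned by the view."""
    return len({row["value"] for row in rows})


# Rows shaped like the emits listed above.
rows = [
    {"key": [2012, 6, 4], "value": "acc_id_1"},
    {"key": [2012, 6, 4], "value": "acc_id_2"},
    {"key": [2012, 6, 4], "value": "acc_id_3"},
    {"key": [2012, 6, 5], "value": "acc_id_1"},
    {"key": [2012, 6, 6], "value": "acc_id_4"},
]
print(view_query((2012, 6, 4), (2012, 6, 6)))
print(unique_logins(rows))
```

   The set is what makes the count unique: acc_id_1 logs in on two different
days but is counted once.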

   The problem arises when the specified date range falls within the
backed-up data, or spans both the backup and the current database.

   1) How do I load backup db/views on demand and merge the results?

   2) Do I need a cron job to take backups of the views and the db?
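
   To make the merge in question 1 concrete, here is a rough sketch in
Python (illustration only, with hypothetical names). It assumes the same view
has already been queried against each database, the restored backup and the
current one, and that each query returned rows with the date array as "key"
and the account id as "value".

```python
def merged_unique_logins(row_sets, start_key, end_key):
    """Count distinct account ids across several databases' view rows.

    row_sets  -- one list of view rows per database (backup, current, ...)
    start_key -- inclusive lower bound, e.g. [2012, 6, 4]
    end_key   -- inclusive upper bound, e.g. [2012, 6, 6]
    """
    accounts = set()
    for rows in row_sets:
        for row in rows:
            # Python compares lists element by element, which matches the
            # ordering of numeric [year, month, day] keys.
            if start_key <= row["key"] <= end_key:
                accounts.add(row["value"])
    return len(accounts)


backup_rows = [
    {"key": [2012, 6, 4], "value": "acc_id_1"},
    {"key": [2012, 6, 4], "value": "acc_id_2"},
]
current_rows = [
    {"key": [2012, 6, 5], "value": "acc_id_1"},
    {"key": [2012, 6, 6], "value": "acc_id_4"},
]
print(merged_unique_logins([backup_rows, current_rows], [2012, 6, 4], [2012, 6, 6]))
```

   The per-database counts cannot simply be added together, because the same
account may have logged in during both the backed-up period and the current
one. The account ids themselves have to be merged before counting.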



*PN:* The reader application communicates with CouchDB using Java APIs.


On Mon, Jun 11, 2012 at 6:57 PM, Dave Cottlehuber <dave@muse.net.nz> wrote:

> On 11 June 2012 14:50, Harikrishnan R <harikrishnan@inxsasia.com> wrote:
> > Hi,
> >
> >     My application uses CouchDB for storing events. On average, my
> > database size increases by up to 4GB daily. So if I keep the database
> > for 3-6 months, my disk space gets reduced.
>
> Presumably this happens by CouchDB compaction? Or do you aggregate
> and then make a summary dataset yourself?
>
> > What is the best strategy for
> >
> >  1) taking the backup of couch-db after a period of time.
>
> This really depends on what you want to be able to restore, and how
> frequently you expect to do that.
>
> Some options:
>
> backup of couchdb file only
>
> -> able to restore any/all docs from that time, and rebuild views from
> that.
>
> backup view & db files
>
> -> able to switch back to production-ready state without waiting for view
> rebuilds. This works nicely if you have CoW storage (snapshots).
>

    What is CoW storage?

>

> wire up _changes feed to a log file.
>
> -> able to retrieve state of doc changes throughout the time period.
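
   As an illustration of the _changes option, a rough sketch in Python. It
assumes the lines come from a continuous feed (GET /<db>/_changes?feed=continuous),
where each change arrives as one JSON object per line, heartbeats are empty
lines, and the feed may end with a {"last_seq": N} object. The function name
and the log format are hypothetical.

```python
import json


def log_changes(feed_lines, log_file):
    """Append each change from a continuous _changes feed to log_file.

    feed_lines -- iterable of text lines read from the feed
    log_file   -- any object with a write() method
    Returns the last sequence seen, so a later run can resume from it.
    """
    last_seq = None
    for line in feed_lines:
        line = line.strip()
        if not line:
            continue  # heartbeat: an empty line that keeps the connection open
        change = json.loads(line)
        if "id" not in change:
            # Not a document change, e.g. the closing {"last_seq": N} object.
            last_seq = change.get("last_seq", last_seq)
            continue
        log_file.write(json.dumps(change, sort_keys=True) + "\n")
        last_seq = change["seq"]
    return last_seq
```

   Note that the feed lists document ids and revisions. To keep the document
bodies as well, the feed would have to be requested with include_docs=true.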
>
> Consider that compaction will remove data from older revisions, keeping
> only the revision tree information, to save space.
>
> So if history of documents is important then you need to either track
> changes and store them separately, or consider if a simple backup prior
> to compaction is enough.
>
> >  2) how to retrieve the data from backup and from the current on demand.
>
> See above, it depends on what you want to do and how. You can restore any
> couch db file and rename it; the docs will be accessible under the new
> filename.
>
> >  3) When to index the documents.
>
> Not clear what you mean here?


> A+
> Dave
>



-- 
-Regards
 Harikrishnan R
