incubator-couchdb-user mailing list archives

From Joshua Bronson <jabron...@gmail.com>
Subject Re: Multipart MIME in dump tool
Date Fri, 19 Jun 2009 20:53:03 GMT

On Thu, Jun 18, 2009 at 4:47 AM, Nils Breunese <n.breunese@vpro.nl> wrote:

> Joshua Bronson wrote:
>
>> I needed a script to dump a large (>30G) couchdb database on a nightly
>> basis for backup purposes, to be performed while couchdb is running, (...)
>
> Did you know that you can just use tools like cp to safely backup live
> CouchDB databases? Using rsync will give you an instant incremental backup
> tool.
>
> Nils Breunese.
>


Thanks for bringing this up. I was actually doing exactly that -- rsyncing
the .couch file -- before switching to json dumps. Here are the reasons I
switched:

  - The format of the .couch files can change from one version of couchdb to
another, so if you ever upgrade couchdb (which you probably will!), you'll
no longer be able to swap in the .couch files.

  - If the .couch file ever somehow gets corrupted, the corruption will
propagate to your backups. Nobody wants to suffer the fate of
ma.gnolia <http://corvusconsulting.ca/2009/02/ma-gnolias-bad-day/>!

  - JSON is human-readable.

  - It takes up less space, and can be further compressed to take up much
less. My 30G .couch file produced a 17G _all_docs_by_seq dump which then
bzip2-compressed to 2.6G.


And now with the latest version of streamcouch.py
<https://svn.openplans.org/melk/util/streamcouch.py>, along with something
like my new wrapper script <https://svn.openplans.org/melk/util/backupcouch>,

  - It does incremental backups too.
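
For anyone curious what "incremental" means here, the idea can be sketched
roughly as below. This is not the code from streamcouch.py or backupcouch,
just a minimal illustration under some assumptions: the rows have already
been parsed from an _all_docs_by_seq response, each row's "key" is assumed
to hold its update sequence number, and the function names
(write_incremental_dump, read_dump) are made up for this example. It also
assumes a Python 3 interpreter for bz2.open.

```python
import bz2
import json


def write_incremental_dump(rows, out_path, since=0):
    """Write rows newer than `since` to a bzip2-compressed JSON-lines file.

    Returns the highest sequence number seen, which the caller would store
    and pass back in as `since` on the next run.
    """
    last_seq = since
    with bz2.open(out_path, "wt", encoding="utf-8") as out:
        for row in rows:
            seq = row["key"]  # assumption: the row key is the update sequence
            if seq <= since:
                continue  # already covered by an earlier dump
            out.write(json.dumps(row) + "\n")
            last_seq = max(last_seq, seq)
    return last_seq


def read_dump(path):
    """Read a dump written by write_incremental_dump back into a list."""
    with bz2.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```

A nightly wrapper would keep the returned sequence number somewhere (a small
state file, say) and hand it back as `since` the next night, so each dump
only contains what changed since the previous one.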


So far I like doing it this way a lot better. If anyone's had a chance to
give it a whirl, I'd love to hear about your experiences with it.
